openSUSE Project Management Tool: Issueshttps://progress.opensuse.org/https://progress.opensuse.org/themes/openSUSE/favicon/favicon.ico?15829177842024-03-20T08:01:51ZopenSUSE Project Management Tool
Redmine openQA Project - action #157576 (Resolved): Cropped openQA version string on bottom of page if cu...https://progress.opensuse.org/issues/1575762024-03-20T08:01:51Zokurzokurz@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>See the cropped version string on the bottom despite enough white space available:</p>
<p><img src="https://progress.opensuse.org/attachments/download/17461/Screenshot_20240320_084634_openqa_cropped_version_string.png" alt="Screenshot_20240320_084634_openqa_cropped_version_string.png" loading="lazy" /></p>
<p>The full version string is readable by scrolling down, but the scroll bar should not be necessary.</p>
<a name="Steps-to-reproduce"></a>
<h2 >Steps to reproduce<a href="#Steps-to-reproduce" class="wiki-anchor">¶</a></h2>
<ul>
<li>Visit an openQA page using a custom links_footer_left and/or links_footer_right, e.g. <a href="https://openqa.suse.de/tests/overview?version=12-SP5&groupid=427&flavor=Azure-Standard-Updates&distri=sle&build=20240318-1" class="external">https://openqa.suse.de/tests/overview?version=12-SP5&groupid=427&flavor=Azure-Standard-Updates&distri=sle&build=20240318-1</a></li>
</ul>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<ul>
<li>No cropping and no scroll bar should appear if there is enough blank space</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Try to solve this in general with some CSS related to <a href="https://github.com/os-autoinst/openQA/blob/cd53217ffe9127ebe8fa697e4c133836a3be36e6/templates/webapi/layouts/bootstrap.html.ep#L62" class="external">https://github.com/os-autoinst/openQA/blob/cd53217ffe9127ebe8fa697e4c133836a3be36e6/templates/webapi/layouts/bootstrap.html.ep#L62</a>. If that is not possible, try to solve it within <a href="https://gitlab.suse.de/openqa/salt-states-openqa/-/blob/master/openqa/links_footer_left.html?ref_type=heads" class="external">https://gitlab.suse.de/openqa/salt-states-openqa/-/blob/master/openqa/links_footer_left.html?ref_type=heads</a> or <a href="https://gitlab.suse.de/openqa/salt-states-openqa/-/blob/master/openqa/links_footer_right.html?ref_type=heads" class="external">https://gitlab.suse.de/openqa/salt-states-openqa/-/blob/master/openqa/links_footer_right.html?ref_type=heads</a></li>
<li>Just open <a href="https://openqa.suse.de/tests/overview?version=12-SP5&groupid=427&flavor=Azure-Standard-Updates&distri=sle&build=20240318-1" class="external">https://openqa.suse.de/tests/overview?version=12-SP5&groupid=427&flavor=Azure-Standard-Updates&distri=sle&build=20240318-1</a> in your browser and try to fix the display with the browser developer tools and then find the right spot in our code where to apply the style, e.g. in our CSS files or html.ep templates directly</li>
</ul>
openQA Project - action #157543 (Resolved): [sporadic] ci openQA: t/ui/23-audit-log.t fails size:Mhttps://progress.opensuse.org/issues/1575432024-03-19T14:16:41Ztinitatina.mueller+trick-redmine@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p><a href="https://app.circleci.com/pipelines/github/os-autoinst/openQA/13196/workflows/ddb935c7-31dd-4beb-877c-25ef1e703b4d/jobs/123228" class="external">https://app.circleci.com/pipelines/github/os-autoinst/openQA/13196/workflows/ddb935c7-31dd-4beb-877c-25ef1e703b4d/jobs/123228</a></p>
<pre><code>[14:08:28] t/ui/23-audit-log.t ........................ 12/?
# Failed test 'most rows filtered out when searching for table create events'
# at t/ui/23-audit-log.t line 40.
# got: '8'
# expected: '3'
# Looks like you failed 1 test of 22.
[14:08:28] t/ui/23-audit-log.t ........................ 13/?
# Failed test 'clickable events'
# at t/ui/23-audit-log.t line 152.
[14:08:28] t/ui/23-audit-log.t ........................ 14/? # Looks like you failed 1 test of 14.
[14:08:28] t/ui/23-audit-log.t ........................ Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/14 subtests
</code></pre>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> Statistically significant stable test execution of t/ui/23-audit-log.t</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>DONE: <del>Find out current error rate locally</del> not reproducible</li>
<li>Find out current error rate locally with coverage enabled as this is possibly more likely to reproduce problems we see in circleCI</li>
<li>Consider recent javascript stack related updates which might impact that</li>
<li>Identify the specific point of sporadic failure source by debugging the unit test and executed code itself</li>
<li>Apply changes to the test code to make it more robust, possibly similar to what was done in other UI tests in the past, with some means of synchronization</li>
<li>Verify that the test is stable again</li>
</ul>
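<p>As a rough aid for AC1, the following sketch estimates how many consecutive passing runs would be needed before a given failure rate can be ruled out with a chosen confidence. It assumes independent test runs; the function name <code>passes_needed</code> and the example thresholds are hypothetical and not part of openQA.</p>

```python
import math


def passes_needed(max_failure_rate: float, confidence: float = 0.95) -> int:
    """Consecutive passing runs needed to rule out max_failure_rate.

    If the real failure rate were max_failure_rate, the chance of seeing
    n passes in a row would be (1 - max_failure_rate) ** n. We return the
    smallest n that pushes this chance down to (1 - confidence) or lower.
    Assumes independent runs, which is an idealization for CI jobs.
    """
    if not 0 < max_failure_rate < 1:
        raise ValueError("max_failure_rate must be between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - max_failure_rate))


if __name__ == "__main__":
    # Runs needed to rule out a 5% failure rate at 95% confidence
    print(passes_needed(0.05))
```

<p>Under these assumptions a sporadic failure of a few percent needs dozens of clean runs before the test can reasonably be called stable, which is why a handful of local passes ("not reproducible") is weak evidence.</p>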
openQA Project - action #157534 (Resolved): Multi-Machine Job fails in suseconnect_scc due to wor...https://progress.opensuse.org/issues/1575342024-03-19T14:06:49Zacarvajalacarvajal@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-15-SP5-Server-DVD-HA-Incidents-x86_64-qam_ha_rolling_update_node01@64bit fails in<br>
<a href="https://openqa.suse.de/tests/13823482/modules/suseconnect_scc/steps/25" class="external">suseconnect_scc</a></p>
<p>It fails while attempting to call <code>script_output</code> which does a <code>curl</code> command to 10.0.2.2 to download the script.</p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>Testsuite maintained at <a href="https://gitlab.suse.de/qa-maintenance/qam-openqa-yml" class="external">https://gitlab.suse.de/qa-maintenance/qam-openqa-yml</a>.</p>
<p><code>rolling_update</code> tests take a working cluster and perform a migration in each node while the node is in maintenance.</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails sporadically since (at least) Build <a href="https://openqa.suse.de/tests/13823340" class="external">:32868:expat</a></p>
<p>Majority of the failures have been seen in worker40.</p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/13822172" class="external">:32996:sed</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Server-DVD-HA-Incidents&machine=64bit&test=qam_ha_rolling_update_node01&version=15-SP5" class="external">latest</a></p>
openQA Project - action #157018 (Resolved): [sporadic] Build failed in Jenkins: submit-openQA-TW-...https://progress.opensuse.org/issues/1570182024-03-11T10:27:19Ztinitatina.mueller+trick-redmine@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<pre><code>Date: Sat, 9 Mar 2024 03:49:48 +0100 (CET)
See <http://jenkins.qa.suse.de/job/submit-openQA-TW-to-oS_Fctry/1001/display/redirect>
Changes:
------------------------------------------
[...truncated 4.20 MiB...]
<result project="devel:openQA:tested" repository="openSUSE_Factory" arch="x86_64" code="blocked" state="blocked">
<status package="openQA" code="blocked">
+ echo 'Waiting while openQA is in progress'
Waiting while openQA is in progress
...
4.6.1709822711.90519fe6
4.6.1709822711.90519fe6
4.6.1709822711.90519fe6
4.6.1709822711.90519fe6' openSUSE:Factory
Server returned an error: HTTP Error 503: Service Unavailable
</code></pre>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> Short unavailabilities of OBS are covered with retry</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Use <a href="https://build.opensuse.org/package/show/openSUSE:Factory/retry" class="external">https://build.opensuse.org/package/show/openSUSE:Factory/retry</a> in the according script from github.com/os-autoinst/scripts/</li>
</ul>
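<p>The idea behind AC1 can be sketched as a small wrapper that repeats a command until it succeeds. This is only an illustration in Python, not the <code>retry</code> script referenced above and not the actual submission script; <code>run_with_retry</code> and its parameters are hypothetical names.</p>

```python
import subprocess
import sys
import time


def run_with_retry(command, retries=3, sleep_s=1.0,
                   runner=subprocess.run, sleep=time.sleep):
    """Run command until it exits with 0 or all attempts are used up.

    retries is the number of additional attempts after the first one.
    runner and sleep are injectable so the logic can be tested without
    spawning processes or actually waiting.
    Returns the result object of the last attempt.
    """
    attempts = retries + 1
    for attempt in range(1, attempts + 1):
        result = runner(command)
        if result.returncode == 0 or attempt == attempts:
            return result
        # Short outage assumed: wait a bit before trying again
        sleep(sleep_s)


if __name__ == "__main__":
    # Placeholder command that always succeeds; a real call would wrap
    # the command talking to OBS instead.
    done = run_with_retry([sys.executable, "-c", "pass"])
    print(done.returncode)
```

<p>A fixed sleep between attempts is the simplest choice; whether that is long enough to bridge a typical OBS outage would need to be checked against real downtime durations.</p>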
openQA Project - action #156769 (Resolved): openQA nightly documentation build CI jobs fail with ...https://progress.opensuse.org/issues/1567692024-03-06T13:56:17Zokurzokurz@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p><a href="https://app.circleci.com/pipelines/github/os-autoinst/openQA/13086/workflows/c5beef2e-d0f1-4516-93cd-552ea903e83f/jobs/122001" class="external">https://app.circleci.com/pipelines/github/os-autoinst/openQA/13086/workflows/c5beef2e-d0f1-4516-93cd-552ea903e83f/jobs/122001</a></p>
<pre><code>Building native extensions. This could take a while...
ERROR: Error installing asciidoctor-pdf:
ERROR: Failed to build gem native extension.
current directory: /home/squamata/project/.gem/gems/bigdecimal-3.1.6/ext/bigdecimal
/usr/bin/ruby.ruby2.5 -r ./siteconf20240306-923-t3r9gw.rb extconf.rb
mkmf.rb can't find header files for ruby at /usr/lib64/ruby/include/ruby.h
extconf failed, exit code 1
Gem files will remain installed in /home/squamata/project/.gem/gems/bigdecimal-3.1.6 for inspection.
Results logged to /home/squamata/project/.gem/extensions/x86_64-linux/2.5.0/bigdecimal-3.1.6/gem_make.out
Exited with code exit status 1
</code></pre>
<p>I think I have seen this reproduce over the past few days.</p>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Check where ruby2.5 is pulled in which is possibly outdated and ensure to use a current version</li>
<li>Consider quick fixes or dropping PDF support altogether, but also keep in mind references in "gh-pages" itself</li>
<li>Consider updating the stack but then the base of Leap is a problem, so maybe switch to a different container base, e.g. Tumbleweed or use pandoc or something else to generate the PDF or headless web-browser</li>
</ul>
openQA Project - action #156625 (Resolved): [alert] Scripts CI pipeline failing due to osd yieldi...https://progress.opensuse.org/issues/1566252024-03-05T07:56:08Zokurzokurz@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>After <a class="issue tracker-4 status-3 priority-5 priority-high3 closed behind-schedule" title="action: [alert] Scripts CI pipeline failing after logging multiple Job state of job ID 13603796: running... (Resolved)" href="https://progress.opensuse.org/issues/156052">#156052</a> we still have a case <a href="https://gitlab.suse.de/openqa/scripts-ci/-/jobs/2344558" class="external">https://gitlab.suse.de/openqa/scripts-ci/-/jobs/2344558</a> like this:</p>
<pre><code>Job state of job ID 13715326: running, waiting …
{"blocked_by_id":null,"id":13715326,"result":"none","state":"running"}
Job state of job ID 13715326: running, waiting …
Request failed, hit error 503, retrying up to 60 more times after waiting …
…
Request failed, hit error 503, retrying up to 1 more times after waiting …
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Unavailable</title>
</head><body>
<h1>Service Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>
<p>Additionally, a 503 Service Unavailable
error was encountered while trying to use an ErrorDocument to handle the request.</p>
<hr>
<address>Apache/2.4.51 (Linux/SUSE) Server at openqa.suse.de Port 80</address>
</body></html>
</code></pre>
<p>that's possibly a retry over multiple minutes but still something is off here.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> (vague) openqa-cli waits sufficiently long to cover usual OSD outages</li>
<li><strong>AC2:</strong> The retry-functionality in openqa-cli was double-verified and works as intended</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Test the openqa-cli behaviour maybe together with an apache proxy on a local installation</li>
<li>Check if the retry actually properly sleeps in between</li>
<li>Consider adding exponential backoff to openqa-cli, see <a href="https://github.com/okurz/retry/blob/main/retry#L49" class="external">https://github.com/okurz/retry/blob/main/retry#L49</a></li>
<li>Consider adding a timestamp to the gitlab CI pipeline output</li>
<li>Consider outputting the value of <code>OPENQA_CLI_RETRY_SLEEP_TIME_S</code> in the <code>Request failed, hit error ..., retrying up to ... more times after waiting</code> line</li>
</ul>
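<p>To illustrate two of the suggestions above, here is a minimal sketch of a capped exponential backoff schedule and of a log line that includes the actual wait time. It is written in Python for illustration only and does not reflect how openqa-cli is implemented; <code>backoff_delays</code>, <code>retry_message</code> and all default values are assumptions.</p>

```python
def backoff_delays(retries, base_s=1.0, factor=2.0, max_s=60.0):
    """Return the wait time in seconds before each retry.

    The delay starts at base_s, is multiplied by factor after every
    attempt and is capped at max_s so late retries do not wait forever.
    """
    return [min(base_s * factor ** i, max_s) for i in range(retries)]


def retry_message(status, remaining, delay_s):
    """Format a log line that states how long the next wait will be."""
    return (f"Request failed, hit error {status}, retrying up to "
            f"{remaining} more times after waiting {delay_s:g} s")


if __name__ == "__main__":
    delays = backoff_delays(5, base_s=1.0, factor=2.0, max_s=10.0)
    for remaining, delay in zip(range(len(delays), 0, -1), delays):
        print(retry_message(503, remaining, delay))
    # Total time covered by the schedule, useful to compare with outage length
    print(sum(delays))
```

<p>Summing the schedule gives the total outage duration the retries can bridge, which is the number to compare against usual OSD downtimes for AC1.</p>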
openQA Project - action #156394 (Resolved): [tools] some Automatic investigation jobs for job 136...https://progress.opensuse.org/issues/1563942024-03-01T01:53:35Zrfan1richard.fan@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Hello tools team experts, <a href="https://openqa.suse.de/tests/13642310#comments" class="external">https://openqa.suse.de/tests/13642310#comments</a></p>
<p>Based on the test comments, the automatic investigation jobs for job 13642310 passed, but NOT all test modules were scheduled.</p>
<p>The failed job failed at <code>system_prepare</code>, however the automatic investigation jobs stop at <code>first_boot</code>.</p>
<p>Can you please help check? thanks!</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> openqa-investigate jobs run more or less the original test module schedule w/o publishing assets</li>
<li><strong>AC2:</strong> os-autoinst+openQA must continue to be ignorant of os-autoinst/scripts specifics</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Use a different variable value for the publish variables and adapt the handling in os-autoinst+openQA accordingly to accept a value like "none" to not impact the schedule but still not publish anything and/or adapt the test code accordingly</li>
<li>Ensure that still nothing is published</li>
</ul>
openQA Infrastructure - action #156016 (Resolved): [openQA][sle-micro][virtualization] Test slem_...https://progress.opensuse.org/issues/1560162024-02-26T06:38:46Zwaynechen55wchen@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Test run <a href="https://openqa.suse.de/tests/13602228#step/firstrun/8" class="external">slem_virtualization@uefi</a> keeps failing due to missing passphrase.</p>
<p>I found there is no</p>
<pre><code>_SECRET_AD_DOMAIN_PASSWORD "[redacted]"
</code></pre>
<p>in vars.json</p>
<a name="Steps-to-reproduce"></a>
<h2 >Steps to reproduce<a href="#Steps-to-reproduce" class="wiki-anchor">¶</a></h2>
<ul>
<li>Look in vars.json of the failed test run</li>
</ul>
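<p>The manual lookup above could be scripted along these lines. This is a small sketch that assumes vars.json is a flat JSON object of settings; the helper name <code>missing_settings</code> is hypothetical, and the sample content is made up for the example.</p>

```python
import json


def missing_settings(vars_json_text, required):
    """Return the required setting names that are absent from vars.json.

    vars_json_text is the file content as a string; required is an
    iterable of setting names. Order of the result follows required.
    """
    settings = json.loads(vars_json_text)
    return [name for name in required if name not in settings]


if __name__ == "__main__":
    # Made-up vars.json content without the secret setting
    sample = '{"ARCH": "x86_64", "DISTRI": "sle-micro"}'
    print(missing_settings(sample, ["_SECRET_AD_DOMAIN_PASSWORD", "ARCH"]))
```

<p>Checking only for the presence of the key avoids printing the secret value itself into logs.</p>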
<a name="Impact"></a>
<h2 >Impact<a href="#Impact" class="wiki-anchor">¶</a></h2>
<p>Virtualization test run with encrypted image failed due to missing passphrase.</p>
<a name="Problem"></a>
<h2 >Problem<a href="#Problem" class="wiki-anchor">¶</a></h2>
<p>Missing setting</p>
<pre><code>_SECRET_AD_DOMAIN_PASSWORD "[redacted]"
</code></pre>
<p>in triggered test run.</p>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Check how relevant test is triggered</li>
<li>Check any missing settings</li>
<li>Retrigger relevant test run</li>
</ul>
<a name="Workaround"></a>
<h2 >Workaround<a href="#Workaround" class="wiki-anchor">¶</a></h2>
<p>n/a</p>
openQA Project - action #155716 (Resolved): [alert] openqa-worker-cacheservice fails to start on ...https://progress.opensuse.org/issues/1557162024-02-21T08:27:46Zokurzokurz@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>From <a href="https://monitor.qa.suse.de/d/KToPYLEWz/failed-systemd-services?orgId=1" class="external">https://monitor.qa.suse.de/d/KToPYLEWz/failed-systemd-services?orgId=1</a></p>
<p><code>ssh worker29.oqa.prg2.suse.org "journalctl -u openqa-worker-cacheservice"</code> says</p>
<pre><code>Feb 21 09:25:43 worker29 openqa-workercache-daemon[86009]: [86009] [e] Database has been corrupted: DBD::SQLite::db commit failed: disk I/O error at /u>
Feb 21 09:25:43 worker29 openqa-workercache-daemon[86009]: [86009] [e] Killing processes accessing the database file handles and removing database
</code></pre>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> Cache service on worker29 works again</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li><em>DONE</em> Add silence(s)</li>
<li>Gather logs helpful for debugging especially before the machine is rebooted</li>
<li>Maybe ext2 is just unreliable -> yes, it is. A reboot of the machine already fixed the problem because we recreate the filesystem automatically</li>
<li>Create another ticket for the related fallout of the reboot triggered problem</li>
</ul>
<a name="Rollback-actions"></a>
<h2 >Rollback actions<a href="#Rollback-actions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Remove silence <code>alertname=Failed systemd services alert</code> from <a href="https://monitor.qa.suse.de/alerting/silences" class="external">https://monitor.qa.suse.de/alerting/silences</a></li>
<li>Remove silence <code>alertname=Broken workers alert</code> from <a href="https://monitor.qa.suse.de/alerting/silences" class="external">https://monitor.qa.suse.de/alerting/silences</a></li>
</ul>
<a name="Out-of-scope"></a>
<h2 >Out of scope<a href="#Out-of-scope" class="wiki-anchor">¶</a></h2>
<ul>
<li>Using another filesystem</li>
</ul>
openQA Project - action #155173 (Resolved): [openqa-in-openqa] [sporadic] test fails in openqa_wo...https://progress.opensuse.org/issues/1551732024-02-08T09:25:36Ztinitatina.mueller+trick-redmine@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario openqa-Tumbleweed-dev-x86_64-openqa_install_nginx@64bit-2G fails in<br>
<a href="https://openqa.opensuse.org/tests/3922710/modules/openqa_worker/steps/9" class="external">openqa_worker</a></p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.opensuse.org/tests/3922710" class="external">:TW.26398</a> (current job)</p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.opensuse.org/tests/3922125" class="external">:TW.26397</a> (or more recent)</p>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Look up older tickets about adding os-autoinst-setup-multi-machine to openQA-in-openQA tests and add them as references</li>
<li>Try to reproduce and fix or simply apply a mitigation as applicable, e.g. increase timeout or retry or something</li>
<li>The proper place to fix might be in the test code but could also be in os-autoinst-setup-multi-machine itself or even further low-level</li>
</ul>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=openqa&flavor=dev&machine=64bit-2G&test=openqa_install_nginx&version=Tumbleweed" class="external">latest</a></p>
openQA Project - action #155170 (Resolved): [openqa-in-openqa] [sporadic] test fails in test_runn...https://progress.opensuse.org/issues/1551702024-02-08T09:24:12Ztinitatina.mueller+trick-redmine@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario openqa-Tumbleweed-dev-x86_64-openqa_install_multimachine@64bit-4G fails in<br>
<a href="https://openqa.opensuse.org/tests/3923320/modules/test_running/steps/5" class="external">test_running</a>.</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.opensuse.org/tests/3923320" class="external">:TW.26399</a> (current job)</p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.opensuse.org/tests/3922709" class="external">:TW.26398</a> (or more recent)</p>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Take a look into <a href="https://openqa.opensuse.org/tests/3923320/file/test_running-mm_testresults.txz" class="external">https://openqa.opensuse.org/tests/3923320/file/test_running-mm_testresults.txz</a></li>
<li>Apply same steps as in <a class="issue tracker-4 status-3 priority-5 priority-high3 closed behind-schedule" title="action: [openqa-in-openqa] [sporadic] test fails in openqa_worker: os-autoinst-setup-multi-machine timed ... (Resolved)" href="https://progress.opensuse.org/issues/155173">#155173</a> but at a slightly different code location</li>
<li>Consider if this issue is actually the same as <a class="issue tracker-4 status-3 priority-5 priority-high3 closed behind-schedule" title="action: [openqa-in-openqa] [sporadic] test fails in openqa_worker: os-autoinst-setup-multi-machine timed ... (Resolved)" href="https://progress.opensuse.org/issues/155173">#155173</a></li>
<li><em>DONE</em> Investigate if the error message from dmesg <code>Failed to associated timeout policy `ovs_test_tp'</code> could be related to the failure: no, it also happens in passed runs</li>
</ul>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=openqa&flavor=dev&machine=64bit-4G&test=openqa_install_multimachine&version=Tumbleweed" class="external">latest</a></p>
openQA Project - action #155068 (Resolved): [sporadic] failing openQA unit test t/ui/16-tests_job...https://progress.opensuse.org/issues/1550682024-02-07T10:30:30Zokurzokurz@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>From <a href="https://app.circleci.com/pipelines/github/os-autoinst/openQA/12918/workflows/7aa1aa51-813d-483e-b8bf-c5a5cac46d95/jobs/120429" class="external">https://app.circleci.com/pipelines/github/os-autoinst/openQA/12918/workflows/7aa1aa51-813d-483e-b8bf-c5a5cac46d95/jobs/120429</a> , first time occurrence.</p>
openQA Project - action #154816 (Resolved): "Mojo::Reactor::Poll: Timer failed: Invalid character...https://progress.opensuse.org/issues/1548162024-02-02T12:47:51Zhsehic
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Error (full log attached):<br>
Mojo::Reactor::Poll: Timer failed: Invalid characters in X-API-Key header at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/Headers.pm line 38.<br>
[...]<br>
[error] Stopping because a critical error occurred.<br>
[error] Another error occurred when trying to stop gracefully due to an error<br>
[error] Trying to kill ourself forcefully now</p>
<a name="Steps-to-reproduce"></a>
<h2 >Steps to reproduce<a href="#Steps-to-reproduce" class="wiki-anchor">¶</a></h2>
<p>Host OS SLEMicro 5.5 (GM) on kvm<br>
transactional-update</p>
<pre><code>podman pull --tls-verify=false registry.suse.de/home/okurz/bci/15.5/containers_backports_updates/suse/openqa-single-instance:latest
podman run <container>
</code></pre>
openQA Project - action #154552 (Resolved): [ppc64le] test fails in iscsi_client - zypper reports...https://progress.opensuse.org/issues/1545522024-01-30T13:22:19Zacarvajalacarvajal@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-15-SP6-Online-ppc64le-SAPHanaSR_ScaleUp_PerfOpt_WMP_node01@ppc64le-sap fails in<br>
<a href="https://openqa.suse.de/tests/13381519/modules/iscsi_client/steps/9" class="external">iscsi_client</a></p>
<p>Other MM jobs in ppc64le in the job group also failed:</p>
<p><a href="https://openqa.suse.de/tests/13381522#step/iscsi_client/9" class="external">https://openqa.suse.de/tests/13381522#step/iscsi_client/9</a></p>
<p>But failure seems to be limited to ppc64le as equivalent x86_64 jobs cleared this step:</p>
<p><a href="https://openqa.suse.de/tests/13382300" class="external">https://openqa.suse.de/tests/13382300</a> & <a href="https://openqa.suse.de/tests/13382301" class="external">https://openqa.suse.de/tests/13382301</a><br>
<a href="https://openqa.suse.de/tests/13382303" class="external">https://openqa.suse.de/tests/13382303</a> & <a href="https://openqa.suse.de/tests/13382304" class="external">https://openqa.suse.de/tests/13382304</a></p>
<p>(Those fail later in an unrelated bsc#)</p>
<p>Recommendation is to investigate if something changed or if there is something wrong on qemu_ppc64le-large-mem workers, as HA jobs in the same build in ppc64le were able to clear that test module and in some cases pass completely:</p>
<p>Alpha Cluster: <a href="https://openqa.suse.de/tests/13364670" class="external">https://openqa.suse.de/tests/13364670</a> & <a href="https://openqa.suse.de/tests/13364672" class="external">https://openqa.suse.de/tests/13364672</a> (passes)<br>
Beta Cluster: <a href="https://openqa.suse.de/tests/13364675" class="external">https://openqa.suse.de/tests/13364675</a> & <a href="https://openqa.suse.de/tests/13364678" class="external">https://openqa.suse.de/tests/13364678</a> (fails later in <code>filesystem</code> module)<br>
(There are other examples in <a href="https://openqa.suse.de/tests/overview?distri=sle&version=15-SP6&build=50.1&groupid=143" class="external">https://openqa.suse.de/tests/overview?distri=sle&version=15-SP6&build=50.1&groupid=143</a>)</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/13374449" class="external">50.1</a></p>
<p>Same test with same build but 3 days ago did not show this issue: <a href="https://openqa.suse.de/tests/13364664" class="external">https://openqa.suse.de/tests/13364664</a></p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=ppc64le&distri=sle&flavor=Online&machine=ppc64le-sap&test=SAPHanaSR_ScaleUp_PerfOpt_WMP_node01&version=15-SP6" class="external">latest</a></p>
openQA Project - action #154021 (Resolved): [alert] Ratio of not restarted multi-machine tests by...https://progress.opensuse.org/issues/1540212024-01-22T09:55:10Ztinitatina.mueller+trick-redmine@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p><a href="https://stats.openqa-monitor.qa.suse.de/alerting/grafana/0XohcmfVk/view?orgId=1" class="external">https://stats.openqa-monitor.qa.suse.de/alerting/grafana/0XohcmfVk/view?orgId=1</a></p>
<pre><code>Date: Sun, 21 Jan 2024 21:31:37 +0100
2 firing alert instances
[IMAGE]
GROUPED BY
2 firing instances
Firing [stats.openqa-monitor.qa.suse.de]
Ratio of not restarted multi-machine tests by result alert
View alert [stats.openqa-monitor.qa.suse.de]
Values
A0=50
Labels
alertname
Ratio of not restarted multi-machine tests by result alert
grafana_folder
Salt
rule_uid
0XohcmfVk
Annotations
message
</code></pre>
<p>Investigation hints:</p>
<ul>
<li>Investigate what caused the ratio to change that significantly</li>
<li>Check <a href="https://openqa.suse.de/tests?resultfilter=Failed" class="external">https://openqa.suse.de/tests?resultfilter=Failed</a> and look for a correlation</li>
<li>Follow
<a href="https://progress.opensuse.org/projects/openqatests/wiki/Wiki#Statistical-investigation" class="external">https://progress.opensuse.org/projects/openqatests/wiki/Wiki#Statistical-investigation</a></li>
<li>Check if the number of failed jobs stays high for longer or if this was just caused by a single scenario failing as a whole. See <a href="https://progress.opensuse.org/issues/96191" class="external">https://progress.opensuse.org/issues/96191</a> for details</li>
</ul>
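<p>For the first investigation hints, a quick way to see which result dominates is to compute the share of each result over a list of jobs. The sketch below is only an illustration; it is not the query behind the alert, and <code>result_ratios</code> as well as the sample data are hypothetical.</p>

```python
from collections import Counter


def result_ratios(results):
    """Return the percentage of jobs per result, e.g. passed or failed.

    results is an iterable of result strings, one per job. An empty
    input yields an empty dict instead of dividing by zero.
    """
    counts = Counter(results)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {result: 100.0 * n / total for result, n in counts.items()}


if __name__ == "__main__":
    # Made-up results of four multi-machine jobs
    print(result_ratios(["passed", "failed", "failed", "passed"]))
```

<p>Running this per scenario instead of over all jobs would help to tell a single scenario failing as a whole apart from a broad increase in failures.</p>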