openSUSE Project Management Tool: Issues | https://progress.opensuse.org/ | updated 2023-11-08T14:03:39Z
QA - action #139199 (Resolved): Ensure OSD openQA PowerPC machine redcurrant is operational from ... | https://progress.opensuse.org/issues/139199 | 2023-11-08T14:03:39Z | okurz (okurz@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Most PowerPC machines are being set up in PRG2 within <a class="issue tracker-4 status-15 priority-3 priority-lowest child" title="action: Support move of PowerPC machines to PRG2 size:M (Blocked)" href="https://progress.opensuse.org/issues/132140">#132140</a> and are at least discoverable from HMC. Now we can set up redcurrant as a production openQA PowerVM worker in OSD again.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> redcurrant openQA instances as referenced in <a href="https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls" class="external">https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls</a> are able to pass openQA jobs after the move to PRG2</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Read <a class="issue tracker-4 status-15 priority-3 priority-lowest child" title="action: Support move of PowerPC machines to PRG2 size:M (Blocked)" href="https://progress.opensuse.org/issues/132140">#132140</a> about the generic setup and in particular the HMC</li>
<li>See current infrastructure management network entry <a href="https://racktables.nue.suse.com/index.php?page=object&object_id=11354" class="external">https://racktables.nue.suse.com/index.php?page=object&object_id=11354</a> and child objects about the machine "redcurrant"</li>
<li>Ensure we have access to redcurrant both manually and via verification openQA jobs, in both cases for OSD</li>
<li>Update the infrastructure management network entry, e.g. new FQDN</li>
<li>Inform users about the result</li>
<li>Crosscheck according alert silences on <a href="https://monitor.qa.suse.de/alerting/silences" class="external">https://monitor.qa.suse.de/alerting/silences</a></li>
</ul>
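<p>The "verification openQA jobs" check from the suggestions above can be sketched as a small helper that clones a known-good OSD job onto the worker class of the moved host. This is a dry-run sketch: the job ID and the <code>redcurrant-1</code> worker class below are placeholders, not values from the ticket.</p>

```shell
#!/bin/sh
# Sketch: build an openqa-clone-job invocation to verify a worker host.
# The job ID and WORKER_CLASS are hypothetical placeholders; the echo makes
# this a dry run - drop it to actually trigger the clone.
clone_verification_job() {
    job_id=$1
    worker_class=$2
    echo "openqa-clone-job --within-instance https://openqa.suse.de" \
        "$job_id WORKER_CLASS=$worker_class _GROUP=0"
}

clone_verification_job 12345678 redcurrant-1
```

<p>Scheduling with <code>_GROUP=0</code> keeps the cloned verification job out of the production job groups.</p>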
openQA Infrastructure - action #139145 (Resolved): dehydrated on monitor.qe.nue2.suse.org aka. mo... | https://progress.opensuse.org/issues/139145 | 2023-11-06T09:17:18Z | okurz (okurz@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>A Grafana alert fired about "Failed systemd services".</p>
<pre><code>root@monitor:~ # systemctl status dehydrated
● dehydrated.service - Certificate Update Runner for Dehydrated
Loaded: loaded (/usr/lib/systemd/system/dehydrated.service; static)
Active: failed (Result: exit-code) since Mon 2023-11-06 00:21:21 CET; 9h ago
TriggeredBy: ● dehydrated.timer
Main PID: 19901 (code=exited, status=1/FAILURE)
Nov 06 00:19:10 monitor systemd[1]: Starting Certificate Update Runner for Dehydrated...
Nov 06 00:19:10 monitor dehydrated[19901]: # INFO: Using main config file /etc/dehydrated/config
Nov 06 00:19:10 monitor dehydrated[19901]: # INFO: Using additional config file /etc/dehydrated/config.d/suse-ca.sh
Nov 06 00:19:10 monitor dehydrated[19901]: # INFO: Running /usr/bin/dehydrated as dehydrated/dehydrated
Nov 06 00:19:11 monitor sudo[19901]: root : PWD=/ ; USER=dehydrated ; GROUP=dehydrated ; COMMAND=/usr/bin/dehydrated --cron
Nov 06 00:21:21 monitor dehydrated[19997]: EXPECTED value GOT EOF
Nov 06 00:21:21 monitor systemd[1]: dehydrated.service: Main process exited, code=exited, status=1/FAILURE
Nov 06 00:21:21 monitor systemd[1]: dehydrated.service: Failed with result 'exit-code'.
Nov 06 00:21:21 monitor systemd[1]: Failed to start Certificate Update Runner for Dehydrated.
</code></pre>
<p>As of 2023-11-13 the service is no longer in state "failed", so this was apparently a sporadic problem.</p>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Research the error messages upstream</li>
<li>Consider changing the systemd service to restart automatically on failure</li>
</ul>
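<p>The auto-restart suggestion could look like the following drop-in. This is a sketch: whether <code>Restart=</code> is accepted depends on the unit's <code>Type=</code> (older systemd rejects it for <code>Type=oneshot</code> services, in which case simply waiting for the next <code>dehydrated.timer</code> trigger is the alternative), and the retry delay is arbitrary.</p>

```ini
# /etc/systemd/system/dehydrated.service.d/restart.conf (sketch)
# Apply with: systemctl daemon-reload
[Service]
Restart=on-failure
RestartSec=5min
```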
<a name="Rollback-actions"></a>
<h2 >Rollback actions<a href="#Rollback-actions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Remove silence for "failed systemd services"</li>
</ul>
QA - action #139112 (Resolved): Ensure OSD openQA PowerPC machine grenache is operational from PRG2 | https://progress.opensuse.org/issues/139112 | 2023-11-04T12:46:49Z | okurz (okurz@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Most PowerPC machines are being set up in PRG2 within <a class="issue tracker-4 status-15 priority-3 priority-lowest child" title="action: Support move of PowerPC machines to PRG2 size:M (Blocked)" href="https://progress.opensuse.org/issues/132140">#132140</a> and most machines could be discovered from the HMC, but apparently not grenache.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> grenache openQA instances as referenced in <a href="https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls" class="external">https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls</a> are able to pass openQA jobs after the move to PRG2</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Read <a class="issue tracker-4 status-15 priority-3 priority-lowest child" title="action: Support move of PowerPC machines to PRG2 size:M (Blocked)" href="https://progress.opensuse.org/issues/132140">#132140</a> about the generic setup and in particular the HMC</li>
<li>See current infrastructure management network entry <a href="https://racktables.nue.suse.com/index.php?page=object&object_id=3120" class="external">https://racktables.nue.suse.com/index.php?page=object&object_id=3120</a> about the machine "grenache"</li>
<li>Ensure we have access to grenache both manually and via verification openQA jobs, in both cases for OSD</li>
<li>Update the infrastructure management network entry <a href="https://racktables.nue.suse.com/index.php?page=object&object_id=3120" class="external">https://racktables.nue.suse.com/index.php?page=object&object_id=3120</a> accordingly, e.g. new FQDN</li>
<li>Inform users about the result</li>
</ul>
<a name="Rollback-steps"></a>
<h2 >Rollback steps<a href="#Rollback-steps" class="wiki-anchor">¶</a></h2>
<ul>
<li>Add back salt key with <code>salt-key -y -a grenache-1.qa.suse.de</code></li>
</ul>
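<p>After re-adding the salt key, it is worth confirming that the key actually shows up as accepted. A small parser for <code>salt-key -L</code> output makes that scriptable; the minion name is the one from the rollback step, and the exact section layout of the output is assumed here.</p>

```shell
#!/bin/sh
# Sketch: check whether a minion id is listed under "Accepted Keys:" in
# `salt-key -L` output. Intended use on the salt master:
#   salt-key -L | key_accepted grenache-1.qa.suse.de
key_accepted() {
    awk -v id="$1" '
        /^Accepted Keys:/ { in_acc = 1; next }   # enter the accepted section
        /^[A-Za-z].*Keys:/ { in_acc = 0 }        # any other section ends it
        in_acc && $1 == id { found = 1 }
        END { exit !found }                      # exit 0 only when found
    '
}
```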
openQA Infrastructure - action #137600 (Resolved): [alert] Packet loss between worker hosts and o... | https://progress.opensuse.org/issues/137600 | 2023-10-09T07:46:02Z | jbaier_cz (jbaier@suse.cz)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>We had multiple occurrences of the packet loss alert over the weekend:</p>
<pre><code>alertname Packet loss between worker hosts and other hosts alert
grafana_folder Salt
rule_uid 2Z025iB4km
</code></pre>
<p><a href="http://stats.openqa-monitor.qa.suse.de/d/EML0bpuGk?orgId=1&viewPanel=4" class="external">http://stats.openqa-monitor.qa.suse.de/d/EML0bpuGk?orgId=1&viewPanel=4</a></p>
<p>Currently, the problematic ones according to the panel are:</p>
<pre><code>imagetester - walter1.qe.nue2.suse.org 100%
petrol-1 - walter1.qe.nue2.suse.org 100%
sapworker1 - walter1.qe.nue2.suse.org 100%
</code></pre>
<p>That is a little bit weird, as I manually checked the first pair and the hosts can reach each other fine:</p>
<pre><code>walter1:~ # ping imagetester.qe.nue2.suse.org
PING imagetester.qe.nue2.suse.org (10.168.192.249) 56(84) bytes of data.
64 bytes from imagetester.qe.nue2.suse.org (10.168.192.249): icmp_seq=7 ttl=64 time=0.326 ms
jbaier@imagetester:~> ping walter1.qe.nue2.suse.org
PING walter1.qe.nue2.suse.org (10.168.192.1) 56(84) bytes of data.
64 bytes from walter1.qe.nue2.suse.org (10.168.192.1): icmp_seq=1 ttl=64 time=0.331 ms
</code></pre>
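<p>The manual cross-check above can be scripted so each pair from the panel is tested the same way. A minimal sketch, assuming the GNU iputils summary-line format ("... 20% packet loss ..."):</p>

```shell
#!/bin/sh
# Sketch: extract the packet-loss percentage from ping's summary line
# (iputils format, e.g. "5 packets transmitted, 4 received, 20% packet loss").
packet_loss() {
    sed -n 's/.*, \([0-9.]*\)% packet loss.*/\1/p'
}

# Typical use (requires access to the hosts in question):
#   ping -c 5 walter1.qe.nue2.suse.org | packet_loss
```

<p>Comparing these values against the Grafana panel would show whether the 100% readings are real loss or a telegraf/measurement artifact.</p>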
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Confirm <strong>when</strong> this started happening or if it's no longer an issue</li>
<li>There are no paused alerts</li>
</ul>
openQA Infrastructure - action #135632 (Resolved): "Mojo::File::spurt is deprecated in favor of M... | https://progress.opensuse.org/issues/135632 | 2023-09-13T06:03:30Z | livdywan (liv.dywan@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>See <a href="https://gitlab.suse.de/openqa/osd-deployment/-/jobs/1825493">https://gitlab.suse.de/openqa/osd-deployment/-/jobs/1825493</a>:</p>
<pre><code>++ echo 'Build status for https://build.opensuse.org/project/show/devel:openQA (openSUSE_Leap_15.4) arch x86_64) is not successful'
++ echo '<resultlist state="9c65a2d1b41fa9bf4c35ffd463cddc69">
<result project="devel:openQA" repository="openSUSE_Leap_15.4" arch="x86_64" code="published" state="published">
[...]
<status package="os-autoinst" code="failed"/>
</code></pre>
<p>And accordingly <a href="https://build.opensuse.org/package/live_build_log/devel:openQA/os-autoinst/openSUSE_Leap_15.4/x86_64">https://build.opensuse.org/package/live_build_log/devel:openQA/os-autoinst/openSUSE_Leap_15.4/x86_64</a> - note that there are no persistent logs, so I attached the log of the failure:</p>
<pre><code>[19:51:05] ./xt/01-style.t ......................................... fatal: not a git repository (or any of the parent directories): .git
[...]
# Failed test 'no (unexpected) warnings (via done_testing)'
at ./t/03-testapi.t line 1105.
# Got the following unexpected warnings:
# 1: Mojo::File::spurt is deprecated in favor of Mojo::File::spew at /home/abuild/rpmbuild/BUILD/os-autoinst-4.6.1694444383.e6a5294/basetest.pm line 433.
</code></pre>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> os-autoinst builds fine in CI</li>
<li><strong>AC2:</strong> The deprecation warning is not "fatal" but merely ignored in the output</li>
</ul>
<a name="Added-by-okurz-after-estimation"></a>
<h3 >Added by okurz after estimation<a href="#Added-by-okurz-after-estimation" class="wiki-anchor">¶</a></h3>
<ul>
<li><strong>AC3:</strong> Open SRs for Tumbleweed and MRs for Leap are accepted</li>
<li><strong>AC4:</strong> <a href="https://build.opensuse.org/package/live_build_log/openSUSE:Factory/openQA/standard/x86_64">https://build.opensuse.org/package/live_build_log/openSUSE:Factory/openQA/standard/x86_64</a> passes</li>
<li><strong>AC5:</strong> <a href="https://build.opensuse.org/package/live_build_log/openSUSE:Factory/os-autoinst/standard/x86_64">https://build.opensuse.org/package/live_build_log/openSUSE:Factory/os-autoinst/standard/x86_64</a> passes</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Note that this is not specific to OBS/Leap 15.4</li>
<li>Confirm the source of the <code>fatal: not a git repository (or any of the parent directories): .git</code> errors
<ul>
<li>This is probably not failing anything and not new</li>
</ul></li>
<li>Address or silence the <code>Mojo::File::spurt is deprecated in favor of Mojo::File::spew</code> warnings, which upset our checks for "no warnings": e.g. just replace all uses of spurt with spew, because that's how rolling distributions work. As a fallback, dynamically look up whether the new method is present, use it if available, and fall back to "spurt" otherwise</li>
<li>Also see <a href="https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/17748">https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/17748</a> and <a href="https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/17746">https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/17746</a></li>
</ul>
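<p>The mechanical part of the suggested spurt-to-spew replacement can be sketched as a repo-wide rewrite. This assumes the simple <code>$file-&gt;spurt($content)</code> call sites where the two methods take the same arguments; review the resulting diff before committing.</p>

```shell
#!/bin/sh
# Sketch: replace Mojo::File::spurt method calls with spew across a checkout.
# Assumes GNU sed; run from the repository root and review with `git diff`.
replace_spurt() {
    dir=$1
    grep -rl -- '->spurt(' "$dir" | while read -r f; do
        sed -i 's/->spurt(/->spew(/g' "$f"
    done
}
```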
<a name="Rollback-actions"></a>
<h2 >Rollback actions<a href="#Rollback-actions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Remove perl-Mojo-IOLoop-ReadWriteProcess and perl-Mojolicious-Plugin-AssetPack from devel:openQA as soon as we have the new version in Tumbleweed and current Leap</li>
<li>Same but in devel:openQA:Leap:15.5</li>
</ul>
QA - action #133748 (Resolved): Move of openqaworker-arm-1 to FC Basement size:M | https://progress.opensuse.org/issues/133748 | 2023-08-03T09:07:38Z | okurz (okurz@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>In #132614 openqaworker-arm-1 was moved to FC Basement so that we have one hot-redundant aarch64 OSD machine outside of PRG2. For that setup we also need to accommodate the automatic recovery feature.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> openqaworker-arm-1 runs OSD production jobs again</li>
<li><strong>AC2:</strong> The automatic recovery of openqaworker-arm-1 on crashes works</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Disable the automatic recovery for openqaworker-arm-1 from the old location</li>
<li>Mount the machine and connect it back into the network including DHCP/DNS in <a href="https://gitlab.suse.de/OPS-Service/salt/" class="external">https://gitlab.suse.de/OPS-Service/salt/</a></li>
<li>Remove old DHCP/DNS entries in <a href="https://gitlab.suse.de/OPS-Service/salt/" class="external">https://gitlab.suse.de/OPS-Service/salt/</a></li>
<li>Update <a href="https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls" class="external">https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls</a></li>
<li>Find on <a href="https://wiki.suse.net/index.php/SUSE-Quality_Assurance/Labs" class="external">https://wiki.suse.net/index.php/SUSE-Quality_Assurance/Labs</a> how the new PDU can be used</li>
<li>Integrate the new PDU in <a href="https://gitlab.suse.de/openqa/grafana-webhook-actions" class="external">https://gitlab.suse.de/openqa/grafana-webhook-actions</a></li>
</ul>
<a name="Rollback-steps"></a>
<h2 >Rollback steps<a href="#Rollback-steps" class="wiki-anchor">¶</a></h2>
<ul>
<li>Add back openqaworker-arm-1 to salt on OSD</li>
<li>after openqaworker-arm-1 is back remove silences in <a href="https://monitor.qa.suse.de/alerting/silences" class="external">https://monitor.qa.suse.de/alerting/silences</a></li>
<li>Remove the "Mute All times" in <a href="https://monitor.qa.suse.de/alerting/routes" class="external">https://monitor.qa.suse.de/alerting/routes</a> for <code>__contacts__ =~ .*"Trigger reboot of openqaworker-arm-1".*</code></li>
</ul>
openQA Infrastructure - action #132860 (Resolved): openqa-piworker is unstable and needs regular ... | https://progress.opensuse.org/issues/132860 | 2023-07-17T08:39:49Z | osukup
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p><a href="https://gitlab.suse.de/openqa/salt-pillars-openqa/-/jobs/1694765" class="external">https://gitlab.suse.de/openqa/salt-pillars-openqa/-/jobs/1694765</a></p>
<p>The only thing found in the logs is in salt_ping.log:</p>
<pre><code>Currently the following minions are down:
8d7
< "openqa-piworker.qa.suse.de"
===================
</code></pre>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> we are able to process openQA Raspberry Pi bare-metal jobs consistently over some days</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li><p>Identify the cause of the regression</p>
<ul>
<li>likely something related to the hardware RTC</li>
<li>check whether it just works with Leap 15.5, because we wanted to upgrade anyway</li>
<li>it could be a recent kernel update, so try downgrading</li>
</ul></li>
<li><p>If it is really necessary and you have exhausted all other remote-controllable options, go to the office, unplug the RTC, reinstall the system (assuming a borked system or corruption), or whatever else helps</p></li>
<li><p>As Plan Y (if options A to X failed) buy a wifi &amp; bluetooth adapter for an IPMI-controllable server and use that instead to connect to the Raspberry Pi bare-metal test instances</p></li>
</ul>
<a name="Rollback-steps"></a>
<h2 >Rollback steps<a href="#Rollback-steps" class="wiki-anchor">¶</a></h2>
<ul>
<li>Add back salt key with <code>ssh osd "sudo salt-key -y -a openqa-piworker.qa.suse.de"</code></li>
</ul>
QA - action #132617 (Resolved): Move of selected LSG QE machines NUE1 to PRG2e size:M | https://progress.opensuse.org/issues/132617 | 2023-07-12T15:00:01Z | okurz (okurz@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>NUE1 needs to be emptied. For some machines we opted to have them moved to "PRG2e", aka "PRG2 Colo Extension" or similar. Assuming nobody does the job for us, we need to unrack the machines and organize the move with Facilities and SUSE-IT.</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> All machines from <a href="https://netbox.suse.de/dcim/devices/?tag=qe-lsg&tag=move-to-prague-colo2" class="external">https://netbox.suse.de/dcim/devices/?tag=qe-lsg&tag=move-to-prague-colo2</a> are usable from PRG2e</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Ask hreinecke and SUSE-IT and wengel and others how this move is organized</li>
<li>As necessary organize transport of equipment: Create ticket over <a href="https://sd.suse.com" class="external">https://sd.suse.com</a> to component "Facilities" asking them how and where to prepare machines for move and ask them to move the equipment to FC Basement</li>
<li>As necessary: Go to the NUE1 Maxtorhof site beforehand and prepare the move, e.g. ensure nothing is connected anymore and the machines are put on pallets, labeled, packed into boxes, etc.</li>
<li>Inform users about the pending move</li>
<li>Ensure machines are usable from PRG2e</li>
<li>Ensure documentation is up-to-date</li>
<li>Inform users after everything is done</li>
</ul>
<a name="Rollback-steps"></a>
<h2 >Rollback steps<a href="#Rollback-steps" class="wiki-anchor">¶</a></h2>
<p>Remove alert silence from <a href="https://monitor.qa.suse.de/alerting/silences" class="external">https://monitor.qa.suse.de/alerting/silences</a> with <code>alertname=openqaw5-xen: host up alert</code></p>
QA - action #103464 (Resolved): qa-tools-backlog-assistant: Extract code into a GitHub Action for... | https://progress.opensuse.org/issues/103464 | 2021-12-03T10:11:11Z | tinita (tina.mueller+trick-redmine@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Currently there are two projects originating from the same one:</p>
<ul>
<li><a href="https://github.com/os-autoinst/qa-tools-backlog-assistant" class="external">https://github.com/os-autoinst/qa-tools-backlog-assistant</a></li>
<li><a href="https://github.com/BillAnastasiadis/qe-c-backlog-assistant" class="external">https://github.com/BillAnastasiadis/qe-c-backlog-assistant</a></li>
</ul>
<p>Originally the script contained the code and configuration.<br>
Now both projects have diverged because they are tracking different backlogs.<br>
Also both projects have been refactored so that the configuration is not directly in the code anymore.</p>
<p>Other projects wanting to use this assistant have to fork it, but because of local changes they can't easily</p>
<ul>
<li>contribute back code improvements</li>
<li>pull in upstream improvements</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Extract code into a GitHub Action, so that projects using it only have to configure it locally via a YAML file.</li>
<li><a href="https://docs.github.com/en/actions/creating-actions" class="external">https://docs.github.com/en/actions/creating-actions</a></li>
<li>Configuration <em>could</em> be done via the workflow file itself (via env vars), but that may include a lot of repetition</li>
<li>A YAML file read by the code directly might be better</li>
<li>The workflow configuration currently needs to define every queue separately, see <a href="https://github.com/BillAnastasiadis/qe-c-backlog-assistant/blob/master/.github/workflows/backlog_checker.yml#L30" class="external">https://github.com/BillAnastasiadis/qe-c-backlog-assistant/blob/master/.github/workflows/backlog_checker.yml#L30</a> ff. A single workflow step might be better.</li>
</ul>
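<p>With the code extracted into an action as suggested, a consumer repository would keep only a thin workflow plus a local config file. The sketch below is purely hypothetical: the action name <code>os-autoinst/backlog-assistant-action</code>, its <code>config</code> input, and the <code>backlog-queues.yaml</code> file name do not exist and only illustrate the shape.</p>

```yaml
# .github/workflows/backlog_checker.yml (hypothetical consumer side)
name: backlog-checker
on:
  schedule:
    - cron: '0 6 * * *'   # run once a day
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Hypothetical published action; name and input are illustrative only.
      - uses: os-autoinst/backlog-assistant-action@v1
        with:
          config: backlog-queues.yaml   # queue definitions live here, not in the workflow
```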
QA - action #95822 (Resolved): qa-maintenance/openQABot failed to trigger aggregate tests with "u... | https://progress.opensuse.org/issues/95822 | 2021-07-22T07:03:59Z | okurz (okurz@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>From <a href="https://chat.suse.de/channel/qem-openqa-review?msg=ifWGbs7QXfJdGJqTf" class="external">https://chat.suse.de/channel/qem-openqa-review?msg=ifWGbs7QXfJdGJqTf</a></p>
<p>From <code>ssh qam2 'journalctl -M openqabot -u openqabot-full --since=2021-07-22'</code>:</p>
<pre><code>Jul 22 01:09:50 openqabot oqaqambot[21718]: INFO: Updates shedule enabled for this run on PUBCLOUD12SP5AZUREStandardgen2:x86_64
Jul 22 01:09:50 openqabot oqaqambot[21718]: INFO: sle-12-SP5-x86_64 repohash: 4a870c348452ec6fb6c9ca52b30d9aea
Jul 22 01:09:50 openqabot oqaqambot[21718]: INFO: Incidents in sle-12-SP5-x86_64: {'ARCH': 'x86_64',
Jul 22 01:09:50 openqabot oqaqambot[21718]: 'BUILD': '20210722-1',
Jul 22 01:09:50 openqabot oqaqambot[21718]: 'DISTRI': 'sle',
Jul 22 01:09:50 openqabot oqaqambot[21718]: 'FLAVOR': 'AZURE-Standard-gen2-Updates',
Jul 22 01:09:50 openqabot oqaqambot[21718]: 'OS_TEST_ISSUES': '20175,20204,20222,20248,20258,20283,20344,20353,20354,20431,20434,20450,20475,20477,20485,4705',
Jul 22 01:09:50 openqabot oqaqambot[21718]: 'PUBLICCLOUD_TOOLS_IMAGE_QUERY': 'https://openqa.suse.de/group_overview/276.json',
Jul 22 01:09:50 openqabot oqaqambot[21718]: 'PUBLIC_CLOUD_AZURE_OFFER': 'sles-12-sp5',
Jul 22 01:09:50 openqabot oqaqambot[21718]: 'PUBLIC_CLOUD_AZURE_SKU': 'gen2',
Jul 22 01:09:50 openqabot oqaqambot[21718]: 'PUBLIC_CLOUD_IMAGE_ID': '',
Jul 22 01:09:50 openqabot oqaqambot[21718]: 'REPOHASH': '4a870c348452ec6fb6c9ca52b30d9aea',
Jul 22 01:09:50 openqabot oqaqambot[21718]: 'SDK_TEST_ISSUES': '20204,20222,20225,20248,20274,20283,20326,20344,20354,20434,20450,20475,20477',
Jul 22 01:09:50 openqabot oqaqambot[21718]: 'VERSION': '12-SP5',
Jul 22 01:09:50 openqabot oqaqambot[21718]: '_OBSOLETE': 1}
Jul 22 01:09:57 openqabot oqaqambot[21718]: WARNING: PUBCLOUD12SP5AZUREBasic is outdated: 20210721-1
Jul 22 01:09:57 openqabot oqaqambot[21718]: INFO: Updates shedule enabled for this run on PUBCLOUD12SP5AZUREBasic:x86_64
Jul 22 01:09:57 openqabot oqaqambot[21718]: INFO: sle-12-SP5-x86_64 repohash: 4a870c348452ec6fb6c9ca52b30d9aea
Jul 22 01:09:57 openqabot oqaqambot[21718]: INFO: Incidents in sle-12-SP5-x86_64: {'ARCH': 'x86_64',
Jul 22 01:09:57 openqabot oqaqambot[21718]: 'BUILD': '20210722-1',
Jul 22 01:09:57 openqabot oqaqambot[21718]: 'DISTRI': 'sle',
Jul 22 01:09:57 openqabot oqaqambot[21718]: 'FLAVOR': 'AZURE-Basic-Updates',
Jul 22 01:09:57 openqabot oqaqambot[21718]: 'OS_TEST_ISSUES': '20175,20204,20222,20248,20258,20283,20344,20353,20354,20431,20434,20450,20475,20477,20485,4705',
Jul 22 01:09:57 openqabot oqaqambot[21718]: 'PUBLICCLOUD_TOOLS_IMAGE_QUERY': 'https://openqa.suse.de/group_overview/276.json',
Jul 22 01:09:57 openqabot oqaqambot[21718]: 'PUBLIC_CLOUD_AZURE_OFFER': 'sles-12-sp5-basic',
Jul 22 01:09:57 openqabot oqaqambot[21718]: 'PUBLIC_CLOUD_AZURE_SKU': 'gen1',
Jul 22 01:09:57 openqabot oqaqambot[21718]: 'PUBLIC_CLOUD_IMAGE_ID': '',
Jul 22 01:09:57 openqabot oqaqambot[21718]: 'REPOHASH': '4a870c348452ec6fb6c9ca52b30d9aea',
Jul 22 01:09:57 openqabot oqaqambot[21718]: 'SDK_TEST_ISSUES': '20204,20222,20225,20248,20274,20283,20326,20344,20354,20434,20450,20475,20477',
Jul 22 01:09:57 openqabot oqaqambot[21718]: 'VERSION': '12-SP5',
Jul 22 01:09:57 openqabot oqaqambot[21718]: '_OBSOLETE': 1}
Jul 22 01:10:02 openqabot oqaqambot[21718]: Traceback (most recent call last):
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/bin/oqaqambot", line 11, in <module>
Jul 22 01:10:02 openqabot oqaqambot[21718]: load_entry_point('openQABot==0.3.0', 'console_scripts', 'oqaqambot')()
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/lib/python3.6/site-packages/openqabot/main.py", line 18, in main
Jul 22 01:10:02 openqabot oqaqambot[21718]: sys.exit(run_bot(logger, args, sys))
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/lib/python3.6/site-packages/openqabot/main.py", line 41, in run_bot
Jul 22 01:10:02 openqabot oqaqambot[21718]: return OpenQABot(metadata, args)()
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/lib/python3.6/site-packages/openqabot/openqabot.py", line 110, in __call__
Jul 22 01:10:02 openqabot oqaqambot[21718]: self.calculate_updates()
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/lib/python3.6/site-packages/openqabot/openqabot.py", line 142, in calculate_updates
Jul 22 01:10:02 openqabot oqaqambot[21718]: incidents = updates.gather_incidents(self.apiurl, arch)
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/lib/python3.6/site-packages/openqabot/update/updates.py", line 109, in gather_incidents
Jul 22 01:10:02 openqabot oqaqambot[21718]: req = self.is_incident_in_testing(apiurl, incident)
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/lib/python3.6/site-packages/openqabot/update/updates.py", line 59, in is_incident_in_testing
Jul 22 01:10:02 openqabot oqaqambot[21718]: res = osc.core.search(apiurl, request=xpath)["request"]
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/lib/python3.6/site-packages/osc/core.py", line 6819, in search
Jul 22 01:10:02 openqabot oqaqambot[21718]: f = http_GET(u)
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/lib/python3.6/site-packages/osc/core.py", line 3421, in http_GET
Jul 22 01:10:02 openqabot oqaqambot[21718]: def http_GET(*args, **kwargs): return http_request('GET', *args, **kwargs)
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/lib/python3.6/site-packages/osc/core.py", line 3410, in http_request
Jul 22 01:10:02 openqabot oqaqambot[21718]: fd = urlopen(req, data=data)
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/lib64/python3.6/urllib/request.py", line 223, in urlopen
Jul 22 01:10:02 openqabot oqaqambot[21718]: return opener.open(url, data, timeout)
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/lib64/python3.6/urllib/request.py", line 532, in open
Jul 22 01:10:02 openqabot oqaqambot[21718]: response = meth(req, response)
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/lib64/python3.6/urllib/request.py", line 642, in http_response
Jul 22 01:10:02 openqabot oqaqambot[21718]: 'http', request, response, code, msg, hdrs)
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/lib64/python3.6/urllib/request.py", line 570, in error
Jul 22 01:10:02 openqabot oqaqambot[21718]: return self._call_chain(*args)
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/lib64/python3.6/urllib/request.py", line 504, in _call_chain
Jul 22 01:10:02 openqabot oqaqambot[21718]: result = func(*args)
Jul 22 01:10:02 openqabot oqaqambot[21718]: File "/usr/lib64/python3.6/urllib/request.py", line 650, in http_error_default
Jul 22 01:10:02 openqabot oqaqambot[21718]: raise HTTPError(req.full_url, code, msg, hdrs, fp)
Jul 22 01:10:02 openqabot oqaqambot[21718]: urllib.error.HTTPError: HTTP Error 500: Internal Server Error
Jul 22 01:10:03 openqabot systemd[1]: openqabot-full.service: Main process exited, code=exited, status=1/FAILURE
Jul 22 01:10:03 openqabot systemd[1]: Failed to start Schedule and review Maintenance incidents in openQA full run.
</code></pre>
<a name="Problem"></a>
<h2 >Problem<a href="#Problem" class="wiki-anchor">¶</a></h2>
<p>This could be a temporary performance problem on openqa.suse.de. In any case, a retry should be conducted.</p>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Logs on openqa.suse.de should be checked for the same timestamp (beware of timezone differences)</li>
<li>Implement a retry, potentially even at the systemd service level, but that could trigger lots of redundant jobs if a minor failure happens <em>after</em> many jobs have already been triggered</li>
</ul>
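<p>A bounded retry as suggested above could be sketched as a generic wrapper. Note the ticket's caution still applies: wrapping the whole bot run re-triggers jobs on every attempt, so such a wrapper only makes sense when the failure occurs before scheduling. The attempt count and delay are arbitrary; the <code>systemctl</code> example in the comment is the command from the workaround section.</p>

```shell
#!/bin/sh
# Sketch: retry a command a fixed number of times with a pause in between.
# Example intended use (from the workaround below):
#   retry 3 systemctl -M openqabot start openqabot-full
retry() {
    attempts=$1; shift
    i=1
    while true; do
        "$@" && return 0                      # success: stop retrying
        [ "$i" -ge "$attempts" ] && return 1  # give up after N attempts
        i=$((i + 1))
        sleep 1
    done
}
```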
<a name="Workaround"></a>
<h2 >Workaround<a href="#Workaround" class="wiki-anchor">¶</a></h2>
<p>On qam2: Trigger <code>systemctl -M openqabot start openqabot-full</code> manually. Caution: this takes multiple minutes, so better run it in a screen session and monitor <code>journalctl -M openqabot -u openqabot-full -f</code></p>
openQA Infrastructure - action #81198 (Resolved): [tracker-ticket] openqaworker-arm-{1..3} have n... | https://progress.opensuse.org/issues/81198 | 2020-12-18T13:36:54Z | nicksinger (nsinger@suse.com)
<p>As we face repeated network problems with our arm workers (e.g. <a href="https://progress.opensuse.org/issues/81026" class="external">https://progress.opensuse.org/issues/81026</a>) we decided to once again disable IPv6 completely on all our arm workers.<br>
This ticket tracks that change so it can be revisited after the Christmas holidays</p>
openQA Infrastructure - action #65178 (Resolved): Drop rsync.pl config from salt for osd and o3 | https://progress.opensuse.org/issues/65178 | 2020-04-02T10:15:17Z | livdywan (liv.dywan@suse.com)
<p>okurz wrote:</p>
<blockquote>
<p>oops, found that we have the repo still installed and configured for both osd and o3 which we should remove before we can call this done. E.g. see <a href="https://gitlab.suse.de/openqa/salt-states-openqa/-/blob/master/openqa/server.sls#L177" class="external">https://gitlab.suse.de/openqa/salt-states-openqa/-/blob/master/openqa/server.sls#L177</a> . IMHO as long as we have the repo checked out and covered in salt we are not done with the cleanup.</p>
</blockquote>
openQA Infrastructure - action #44612 (Resolved): Do we want to update http://tumblesle.qa.suse.d... | https://progress.opensuse.org/issues/44612 | 2018-12-01T07:30:43Z | okurz (okurz@suse.com)
<p><a href="http://tumblesle.qa.suse.de/" class="external">http://tumblesle.qa.suse.de/</a></p>
openQA Infrastructure - action #37644 (Resolved): [tools] osd SSL certificate is only valid for o... | https://progress.opensuse.org/issues/37644 | 2018-06-21T18:58:28Z | okurz (okurz@suse.com)
openQA Infrastructure - action #19190 (Resolved): make use of ix64ph1014, e.g. for proxymode | https://progress.opensuse.org/issues/19190 | 2017-05-17T07:11:33Z | okurz (okurz@suse.com)
<p>[17 May 2017 08:52:46] coolo: do we still need <a href="https://openqa.suse.de/tests/latest?machine=ix64ph1014" class="external">https://openqa.suse.de/tests/latest?machine=ix64ph1014</a> ?<br>
[17 May 2017 08:54:40] okurz: well, I have no love left for this vnc thingie<br>
[17 May 2017 08:54:52] okurz: we better free this machine and use it for proxymode<br>
[17 May 2017 09:08:38] good morning<br>
[17 May 2017 09:09:53] coolo: so I will delete the machine in openQA and delete the schedule?<br>
[17 May 2017 09:10:51] okurz: leave the machine as documentation how to set this up. It might still be wanted in the future - for another machine<br>
[17 May 2017 09:11:00] but drop the job</p>
<p>okurz: I dropped the job from scheduling</p>