openSUSE Project Management Tool | https://progress.opensuse.org/ | 2020-10-24T11:51:52Z
openQA Infrastructure - action #75238: Upgrade osd workers and openqa-monitor to openSUSE Leap 15.2
https://progress.opensuse.org/issues/75238?journal_id=342616 | 2020-10-24T11:51:52Z | okurz (okurz@suse.com)
<ul><li><strong>Subject</strong> changed from <i>Upgrade osd workers to openSUSE Leap 15.2</i> to <i>Upgrade osd workers and other machines, e.g. monitoring, to openSUSE Leap 15.2</i></li></ul>
https://progress.opensuse.org/issues/75238?journal_id=342628 | 2020-10-24T11:54:26Z | okurz (okurz@suse.com)
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Workable</i></li></ul>
https://progress.opensuse.org/issues/75238?journal_id=346942 | 2020-11-04T20:39:38Z | okurz (okurz@suse.com)
<ul><li><strong>Status</strong> changed from <i>Workable</i> to <i>Blocked</i></li><li><strong>Assignee</strong> set to <i>okurz</i></li></ul><p>Let's wait for the corresponding o3 ticket first.</p>
https://progress.opensuse.org/issues/75238?journal_id=355304 | 2020-11-27T15:53:11Z | okurz (okurz@suse.com)
<ul><li><strong>Status</strong> changed from <i>Blocked</i> to <i>Workable</i></li><li><strong>Assignee</strong> deleted (<del><i>okurz</i></del>)</li></ul><p>o3 is good, so this can be followed up on.</p>
https://progress.opensuse.org/issues/75238?journal_id=356022 | 2020-12-02T12:08:54Z | okurz (okurz@suse.com)
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/356022/diff?detail_id=353306">diff</a>)</li></ul>
https://progress.opensuse.org/issues/75238?journal_id=357250 | 2020-12-09T13:43:09Z | livdywan (liv.dywan@suse.com)
<ul><li><strong>Assignee</strong> set to <i>livdywan</i></li></ul>
https://progress.opensuse.org/issues/75238?journal_id=357328 | 2020-12-09T16:37:17Z | livdywan (liv.dywan@suse.com)
<ul><li><strong>Subject</strong> changed from <i>Upgrade osd workers and other machines, e.g. monitoring, to openSUSE Leap 15.2</i> to <i>Upgrade osd workers and openqa-monitor to openSUSE Leap 15.2</i></li><li><strong>Description</strong> updated (<a title="View differences" href="/journals/357328/diff?detail_id=354492">diff</a>)</li></ul><p>For reference:</p>
<ul>
<li>I'm using <code>sudo salt -C 'G@roles:worker' cmd.run 'grep VERSION= /etc/os-release'</code> to check which workers need to be upgraded</li>
<li>Also <code>monitor.qa.suse.de</code></li>
</ul>
<p>That leaves only openqa.suse.de which is covered by <a class="issue tracker-4 status-3 priority-5 priority-high3 closed child" title="action: Upgrade osd webUI host to openSUSE Leap 15.2 (Resolved)" href="https://progress.opensuse.org/issues/75244">#75244</a></p>
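<p>A minimal sketch of how the output of the <code>salt</code> check above could be filtered down to the hosts that still need the upgrade. It assumes salt's default text output, where each minion ID is printed on its own line ending in a colon, followed by the indented command output. The function name <code>hosts_needing_upgrade</code> is made up for illustration. Minions that are down produce no <code>VERSION=</code> line and are therefore not reported by this filter.</p>

```shell
# Hypothetical helper: read "salt ... cmd.run 'grep VERSION= /etc/os-release'"
# output on stdin and print every minion whose VERSION differs from $1.
# Assumes the default salt output layout:
#   minion-id:
#       VERSION="15.2"
hosts_needing_upgrade() {
    target="$1"
    awk -v target="$target" '
        # A non-indented line ending in ":" names the minion.
        /^[^ \t].*:$/ { host = substr($0, 1, length($0) - 1); next }
        # An indented VERSION= line carries the release of that minion.
        /^[ \t]+VERSION=/ {
            version = $0
            sub(/^[ \t]+VERSION=/, "", version)
            gsub(/"/, "", version)
            if (version != target) print host
        }
    '
}

# Usage (on the salt master):
#   sudo salt -C 'G@roles:worker' cmd.run 'grep VERSION= /etc/os-release' \
#       | hosts_needing_upgrade 15.2
```

<p>Keeping the filter separate from the <code>salt</code> call means it can be tried on saved output first.</p>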
https://progress.opensuse.org/issues/75238?journal_id=357334 | 2020-12-09T19:39:42Z | livdywan (liv.dywan@suse.com)
<ul><li><strong>Status</strong> changed from <i>Workable</i> to <i>In Progress</i></li></ul><ul>
<li>DONE openqaworker2
<ul>
<li>installed screen</li>
</ul></li>
<li>DONE openqaworker5
<ul>
<li>Got stuck installing <code>dpdk-kmp-default</code>, good after second zypper run</li>
<li><code>var-lib-openqa-share.mount loaded failed failed /var/lib/openqa/share</code> after reboot</li>
<li>Ran <code>sudo systemctl restart var-lib-openqa-share.mount</code></li>
</ul></li>
<li>DONE openqa-monitor
<ul>
<li>zypper succeeded on the second attempt (refreshes are racy I guess)</li>
</ul></li>
<li>DONE openqaworker6</li>
<li>DONE openqaworker8
<ul>
<li>zypper upgrade went fine.</li>
<li><code>var-lib-openqa-share.mount loaded failed failed /var/lib/openqa/share</code> after reboot</li>
<li>Ran <code>sudo systemctl restart var-lib-openqa-share.mount</code></li>
</ul></li>
</ul>
<p>DONE implies I checked that workers show up on <a href="https://openqa.suse.de/admin/workers" class="external">https://openqa.suse.de/admin/workers</a> and picked up jobs</p>
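<p>Since <code>var-lib-openqa-share.mount</code> failed after reboot on more than one worker, the manual restart could be sketched as a small helper. This is an illustration only: <code>failed_units</code> and <code>restart_failed_mounts</code> are hypothetical names, and the sketch assumes the column layout of <code>systemctl list-units --failed --plain --no-legend</code> (unit name in the first column), which matches the <code>loaded failed failed</code> lines quoted above.</p>

```shell
# Print the unit names (first column) from
# "systemctl list-units --failed --plain --no-legend" style output on stdin.
failed_units() {
    awk 'NF > 0 { print $1 }'
}

# Hypothetical helper: restart every failed unit whose name ends in ".mount".
# Needs systemd and sudo, so it is only defined here, not called.
restart_failed_mounts() {
    systemctl list-units --failed --plain --no-legend | failed_units |
    while read -r unit; do
        case "$unit" in
            *.mount) sudo systemctl restart "$unit" ;;
        esac
    done
}

# Usage on a freshly rebooted worker:
#   restart_failed_mounts
#   systemctl is-active var-lib-openqa-share.mount
```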
https://progress.opensuse.org/issues/75238?journal_id=357336 | 2020-12-09T22:03:20Z | livdywan (liv.dywan@suse.com)
<ul>
<li>DONE openqaworker9</li>
<li>DONE openqaworker10
<ul>
<li>Had to run <code>sudo systemctl restart var-lib-openqa-share.mount</code></li>
<li>installed <code>htop</code></li>
<li>not online, no jobs picked up yet</li>
<li><code>systemctl restart openqa-worker@{1..10}</code> to remedy <a class="issue tracker-4 status-3 priority-4 priority-default closed child" title="action: Worker is stuck in &quot;broken&quot; state due to unavailable cache service (was: and even continuously fa... (Resolved)" href="https://progress.opensuse.org/issues/78390">#78390</a></li>
</ul></li>
<li>DONE openqaworker13
<ul>
<li>Had to run <code>sudo systemctl restart var-lib-openqa-share.mount</code> here as well</li>
</ul></li>
<li>DONE QA-Power8-5-kvm.qa.suse.de
<ul>
<li><code>connection refused</code> after reboot, stuck in petitboot</li>
<li><code>kexec -l /var/petitboot/mnt/dev/sda2/boot/vmlinux --initrd=/var/petitboot/mnt/dev/sda2/boot/initrd --append="root=UUID=eebe647f-e867-416e-a0fa-7a6732bfcf9d console=tty0 console=ttyS1,115200 nospec" && kexec -e</code> made it as far as dracut</li>
<li>Someone else tried again, this time with the right device ID, so I have no log of that command</li>
</ul></li>
<li>DONE QA-Power8-4-kvm.qa.suse.de
<ul>
<li><code>connection refused</code> after reboot, stuck in petitboot, <code>kexec load failed</code></li>
<li><code>kexec -l /var/petitboot/mnt/dev/sdb2/boot/vmlinux --initrd=/var/petitboot/mnt/dev/sdb2/boot/initrd --append="root=UUID=eebe647f-e867-416e-a0fa-7a6732bfcf9d console=tty0 console=ttyS1,115200 nospec" && kexec -e</code> resulted in a successful boot</li>
<li>Installed <code>htop</code></li>
<li><code>kdump.service loaded failed failed Load kdump kernel and initrd</code> after reboot</li>
<li><a class="issue tracker-4 status-3 priority-4 priority-default closed behind-schedule" title="action: Check failed services on our workers (Resolved)" href="https://progress.opensuse.org/issues/56588#note-9">#56588#note-9</a> mentions disabling kdump, but I still had to disable it again via <code>sudo systemctl disable --now kdump && sudo systemctl reset-failed</code></li>
<li>Online, not picking up jobs <em>yet</em></li>
</ul></li>
<li>DONE grenache-1
<ul>
<li>not online, no jobs picked up yet</li>
<li>see <a class="issue tracker-4 status-3 priority-4 priority-default closed child" title="action: Worker is stuck in &quot;broken&quot; state due to unavailable cache service (was: and even continuously fa... (Resolved)" href="https://progress.opensuse.org/issues/78390">#78390</a></li>
</ul></li>
</ul>
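<p>The two <code>kexec</code> invocations above differ only in the partition they boot from, so the command can be sketched as a template. This is a hedged illustration, not a verified recovery procedure: <code>petitboot_kexec_cmd</code> is a hypothetical helper that only prints the command, and it assumes the partition is mounted under <code>/var/petitboot/mnt/dev/</code> as in the commands above. Both the partition name and the root UUID have to be checked on the machine itself before running the result.</p>

```shell
# Hypothetical helper: print the kexec command for booting the installed
# system from the petitboot shell.
#   $1: partition holding /boot as mounted by petitboot, e.g. sdb2
#   $2: UUID of the root filesystem (must be verified on the machine)
petitboot_kexec_cmd() {
    part="$1"
    uuid="$2"
    boot="/var/petitboot/mnt/dev/$part/boot"
    # Kernel parameters are copied from the commands quoted in this ticket.
    printf 'kexec -l %s/vmlinux --initrd=%s/initrd --append="root=UUID=%s console=tty0 console=ttyS1,115200 nospec" && kexec -e\n' \
        "$boot" "$boot" "$uuid"
}

# Usage: print the command, review it, then paste it into the petitboot shell.
#   petitboot_kexec_cmd sdb2 eebe647f-e867-416e-a0fa-7a6732bfcf9d
```

<p>Printing the command instead of executing it leaves room to spot a wrong device before the machine ends up in dracut again.</p>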
https://progress.opensuse.org/issues/75238?journal_id=357504 | 2020-12-10T09:24:44Z | livdywan (liv.dywan@suse.com)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-4 priority-default closed child" href="/issues/78390">action #78390</a>: Worker is stuck in "broken" state due to unavailable cache service (was: and even continuously fails to (re)connect to some configured web UIs)</i> added</li></ul>
https://progress.opensuse.org/issues/75238?journal_id=357594 | 2020-12-10T13:11:20Z | livdywan (liv.dywan@suse.com)
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Feedback</i></li></ul>
https://progress.opensuse.org/issues/75238?journal_id=358058 | 2020-12-14T10:17:26Z | livdywan (liv.dywan@suse.com)
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>In Progress</i></li></ul><p>cdywan wrote:</p>
<blockquote>
<p>For reference:</p>
<ul>
<li>I'm using <code>sudo salt -C 'G@roles:worker' cmd.run 'grep VERSION= /etc/os-release'</code> to check which workers need to be upgraded</li>
<li>Also <code>monitor.qa.suse.de</code></li>
</ul>
<p>That leaves only openqa.suse.de which is covered by <a class="issue tracker-4 status-3 priority-5 priority-high3 closed child" title="action: Upgrade osd webUI host to openSUSE Leap 15.2 (Resolved)" href="https://progress.opensuse.org/issues/75244">#75244</a></p>
</blockquote>
<p><a class="user active user-mention" href="https://progress.opensuse.org/users/32669">@Xiaojing_liu</a> made me aware that I missed <code>malbec.arch.suse.de</code>, <code>openqaworker-arm-1.suse.de</code> and <code>openqaworker-arm-2.suse.de</code>, probably due to machines being down 🙄</p>
https://progress.opensuse.org/issues/75238?journal_id=358120 | 2020-12-14T19:34:14Z | livdywan (liv.dywan@suse.com)
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Feedback</i></li></ul><ul>
<li><p>WIP malbec.arch.suse.de</p>
<ul>
<li>Stuck in petitboot after reboot</li>
<li><code>PXE autoconfiguration failed</code></li>
<li><em>netboot</em> fails with <code>load_kernel: /tmp/pb-2eSo7I is not a 64bit PowerPC executable</code></li>
<li>None of the entries mentioned in <a class="issue tracker-4 status-3 priority-3 priority-lowest closed" title="action: OSD deployment failed at 2020-12-02 because 'malbec.arch.suse.de' is down (Resolved)" href="https://progress.opensuse.org/issues/80656#note-9">#80656#note-9</a> are visible.</li>
<li>Booted via a new entry with <code>/boot/vmlinux</code> and <code>/boot/initrd</code> on <em>sdb1</em> with <code>nomodeset console=hvc console=tty</code>.</li>
</ul>
<p><code>[FAILED] Failed to mount /var/lib/openqa/share.</code><br>
<code>[FAILED] Failed to start Load kdump kernel and initrd.</code><br>
Ran <code>systemctl disable --now kdump && sudo systemctl reset-failed</code></p>
<ul>
<li>Mounting /var/lib/openqa/share looks to have succeeded after all.</li>
<li>Worker is registered</li>
</ul></li>
<li><p>WIP openqaworker-arm-2.suse.de</p>
<ul>
<li>was ready to reboot</li>
<li>became unresponsive and was rebooted (by someone else?)</li>
</ul></li>
<li><p>WIP openqaworker-arm-1.suse.de ready to reboot</p></li>
</ul>
https://progress.opensuse.org/issues/75238?journal_id=358138 | 2020-12-15T00:14:33Z | livdywan (liv.dywan@suse.com)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed" href="/issues/81046">action #81046</a>: openqaworker-arm-2.suse.de unreachable</i> added</li></ul>
https://progress.opensuse.org/issues/75238?journal_id=358272 | 2020-12-15T10:24:48Z | livdywan (liv.dywan@suse.com)
<ul>
<li>DONE malbec.arch.suse.de after all</li>
<li>DONE openqaworker-arm-2.suse.de
<ul>
<li>See <a class="issue tracker-4 status-3 priority-5 priority-high3 closed" title="action: openqaworker-arm-2.suse.de unreachable (Resolved)" href="https://progress.opensuse.org/issues/81046">#81046</a></li>
</ul></li>
<li>DONE openqaworker-arm-1.suse.de</li>
</ul>
https://progress.opensuse.org/issues/75238?journal_id=358490 | 2020-12-16T09:48:38Z | livdywan (liv.dywan@suse.com)
<p>Still open:</p>
<ul>
<li><code>openqaworker-arm-3.suse.de</code>, see <a class="issue tracker-4 status-3 priority-5 priority-high3 closed" title="action: [osd-admins][alert] Failed systemd services alert (workers): os-autoinst-openvswitch.service (and... (Resolved)" href="https://progress.opensuse.org/issues/75016">#75016</a></li>
<li><code>powerqaworker-qam-1</code>, see <a class="issue tracker-4 status-3 priority-4 priority-default closed" title="action: powerqaworker-qam-1 fails to come up on reboot (repeatedly) (Resolved)" href="https://progress.opensuse.org/issues/68053">#68053</a></li>
</ul>
https://progress.opensuse.org/issues/75238?journal_id=358494 | 2020-12-16T09:49:20Z | livdywan (liv.dywan@suse.com)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-4 priority-default closed" href="/issues/68053">action #68053</a>: powerqaworker-qam-1 fails to come up on reboot (repeatedly)</i> added</li></ul>
https://progress.opensuse.org/issues/75238?journal_id=358500 | 2020-12-16T09:49:29Z | livdywan (liv.dywan@suse.com)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed" href="/issues/75016">action #75016</a>: [osd-admins][alert] Failed systemd services alert (workers): os-autoinst-openvswitch.service (and var-lib-openqa-share.mount) on openqaworker-arm-2 and others</i> added</li></ul>
https://progress.opensuse.org/issues/75238?journal_id=358548 | 2020-12-16T11:01:15Z | livdywan (liv.dywan@suse.com)
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Blocked</i></li></ul>
https://progress.opensuse.org/issues/75238?journal_id=358886 | 2020-12-18T09:33:45Z | livdywan (liv.dywan@suse.com)
<ul>
<li>DONE <code>openqaworker-arm-3.suse.de</code>
<ul>
<li>Rebooted while there were no jobs running</li>
</ul></li>
</ul>
https://progress.opensuse.org/issues/75238?journal_id=358918 | 2020-12-18T11:59:55Z | livdywan (liv.dywan@suse.com)
<ul><li><strong>Status</strong> changed from <i>Blocked</i> to <i>In Progress</i></li></ul><ul>
<li>DONE <code>powerqaworker-qam-1.qa.suse.de</code></li>
</ul>
https://progress.opensuse.org/issues/75238?journal_id=358942 | 2020-12-18T14:50:33Z | livdywan (liv.dywan@suse.com)
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Feedback</i></li></ul>
https://progress.opensuse.org/issues/75238?journal_id=359006 | 2020-12-20T20:22:04Z | okurz (okurz@suse.com)
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Resolved</i></li></ul><p><code>ssh osd "sudo salt '*' cmd.run 'grep VERSION /etc/os-release'"</code> returns 15.2 for all machines that are currently in salt :) staging machines are left as an exercise to the next users :D Do you agree to set this to Resolved?</p>
https://progress.opensuse.org/issues/75238?journal_id=359412 | 2020-12-22T13:35:36Z | livdywan (liv.dywan@suse.com)
<p>okurz wrote:</p>
<blockquote>
<p><code>ssh osd "sudo salt '*' cmd.run 'grep VERSION /etc/os-release'"</code> returns 15.2 for all machines that are currently in salt :) staging machines are left as an exercise to the next users :D Do you agree to set this to Resolved?</p>
</blockquote>
<p>Ack. I wouldn't consider <em>staging</em> part of <em>osd</em>, or of this ticket for that matter. Although I might just sort those out when nobody's looking, I practically know the steps by heart now 😂</p>
https://progress.opensuse.org/issues/75238?journal_id=449106 | 2021-09-24T11:11:17Z | okurz (okurz@suse.com)
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed child" href="/issues/99192">action #99192</a>: Upgrade osd workers and openqa-monitor to openSUSE Leap 15.3 size:M</i> added</li></ul>
https://progress.opensuse.org/issues/75238?journal_id=781390 | 2024-03-26T12:02:47Z | okurz (okurz@suse.com)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed behind-schedule" href="/issues/158041">action #158041</a>: grenache needs upgrade to 15.5</i> added</li></ul>