openSUSE Project Management Tool | https://progress.opensuse.org/ | 2020-12-14T07:42:38Z
openQA Infrastructure - action #81020: QA-Power8-4-kvm start failed since reboot on 2020-12-13 | https://progress.opensuse.org/issues/81020?journal_id=357930 | 2020-12-14T07:42:38Z | nicksinger (nsinger@suse.com)
<ul><li><strong>Assignee</strong> set to <i>nicksinger</i></li></ul>
openQA Infrastructure - action #81020: QA-Power8-4-kvm start failed since reboot on 2020-12-13 | https://progress.opensuse.org/issues/81020?journal_id=357932 | 2020-12-14T07:42:58Z | nicksinger (nsinger@suse.com)
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li></ul>
openQA Infrastructure - action #81020: QA-Power8-4-kvm start failed since reboot on 2020-12-13 | https://progress.opensuse.org/issues/81020?journal_id=358016 | 2020-12-14T08:35:14Z | nicksinger (nsinger@suse.com)
<ul></ul><p>I can successfully recover the machine with:</p>
<pre><code>kexec -l /var/petitboot/mnt/dev/sdb2/boot/vmlinux-5.3.18-lp152.57-default --initrd=/var/petitboot/mnt/dev/sdb2/boot/initrd-5.3.18-lp152.57-default --command-line="root=UUID=eebe647f-e867-416e-a0fa-7a6732bfcf9d nospec kvm.nested=1 kvm_intel.nested=1 kvm_amd.nested=1 kvm-arm.nested=1 crashkernel=210M"
kexec -e
</code></pre>
<p>So petitboot can find the disk but refuses to load the bootloader entries from it. We're once again at the point where we would need to understand petitboot.<br>
Petitboot on that machine is from 2016. I haven't figured out yet how to update it; most likely it requires a "firmware upgrade" from IBM.<br>
I will try to rewrite grub (configs) now (although nothing changed there according to the zypper logs) and see if that helps.</p>
<p>If we can't figure out what is causing all the Power machines to lose their bootloader entries, we might need to consider removing auto reboots for now… Currently we're keeping ourselves busy with recovering these hosts.</p>
openQA Infrastructure - action #81020: QA-Power8-4-kvm start failed since reboot on 2020-12-13 | https://progress.opensuse.org/issues/81020?journal_id=358044 | 2020-12-14T09:44:02Z | okurz (okurz@suse.com)
<ul></ul><p>nicksinger wrote:</p>
<blockquote>
<p>If we can't figure out what is causing all the Power machines to lose their bootloader entries, we might need to consider removing auto reboots for now… Currently we're keeping ourselves busy with recovering these hosts.</p>
</blockquote>
<p>Agreed. Feel free to just run <code>systemctl mask auto-update.service</code> on all affected machines for now (and revert it before closing this ticket or any other ticket that still mentions it). I could neither <code>test.ping</code> the machine nor log in over SSH to qa-power8-4-kvm.qa right now to do it myself, and I did not want to kick anyone out of SoL.</p>
openQA Infrastructure - action #81020: QA-Power8-4-kvm start failed since reboot on 2020-12-13 | https://progress.opensuse.org/issues/81020?journal_id=358080 | 2020-12-14T12:24:18Z | nicksinger (nsinger@suse.com)
<ul><li><strong>Assignee</strong> changed from <i>nicksinger</i> to <i>okurz</i></li></ul><p>okurz wrote:</p>
<blockquote>
<p>nicksinger wrote:</p>
<blockquote>
<p>If we can't figure out what is causing all the Power machines to lose their bootloader entries, we might need to consider removing auto reboots for now… Currently we're keeping ourselves busy with recovering these hosts.</p>
</blockquote>
<p>Agreed. Feel free to just run <code>systemctl mask auto-update.service</code> on all affected machines for now (and revert it before closing this ticket or any other ticket that still mentions it). I could neither <code>test.ping</code> the machine nor log in over SSH to qa-power8-4-kvm.qa right now to do it myself, and I did not want to kick anyone out of SoL.</p>
</blockquote>
<p>Yeah, I tried to regenerate the grub config as this helped last time. But since Power8-5 showed exactly the same symptoms (stuck in petitboot, detecting network boot but not the installed OS), I'm starting to think we are facing a product bug. Unfortunately I can't figure out in which component :/<br>
The auto-update.service seems to be masked already on Power8-4 as well as Power8-5. Isn't rebootmgr the service causing the reboots?</p>
openQA Infrastructure - action #81020: QA-Power8-4-kvm start failed since reboot on 2020-12-13 | https://progress.opensuse.org/issues/81020?journal_id=358124 | 2020-12-14T22:04:06Z | okurz (okurz@suse.com)
<ul><li><strong>Assignee</strong> changed from <i>okurz</i> to <i>nicksinger</i></li><li><strong>Target version</strong> set to <i>Ready</i></li></ul><p>nicksinger wrote:</p>
<blockquote>
<p>[…]<br>
The auto-update.service seems to be masked already on Power8-4 as well as Power8-5. Isn't rebootmgr the service causing the reboots?</p>
</blockquote>
<p>Yes, of course, stupid me. Assuming you assigned the ticket to me just to answer the question, I am assigning it back to you. If you want me to do something else, you can assign it back, but then please tell me what I should do :)</p>
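<p>For reference, a minimal sketch of how automatic reboots could be switched off through rebootmgr, the service identified above as the one triggering them. It assumes the <code>rebootmgrctl</code> client is installed on the host. The <code>DRY_RUN</code> variable and the helper functions <code>run</code> and <code>disable_auto_reboot</code> are hypothetical names used only for this illustration and are not taken from the ticket. By default the script only prints the commands instead of executing them.</p>

```shell
#!/bin/sh
# Sketch only: stop rebootmgr from rebooting a host automatically.
# DRY_RUN=1 (the default here) just prints the commands;
# set DRY_RUN=0 and run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, run it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: show the current state, then turn the strategy off.
disable_auto_reboot() {
    # Record the current reboot status before changing anything.
    run rebootmgrctl status
    # With the strategy set to "off", rebootmgr ignores reboot requests.
    run rebootmgrctl set-strategy off
}

disable_auto_reboot
```

<p>To revert before closing the ticket, the strategy would have to be set back to whatever <code>rebootmgrctl status</code> reported beforehand, for example with <code>rebootmgrctl set-strategy best-effort</code> if that was the previous value (an assumption, the ticket does not state it).</p>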
openQA Infrastructure - action #81020: QA-Power8-4-kvm start failed since reboot on 2020-12-13 | https://progress.opensuse.org/issues/81020?journal_id=358174 | 2020-12-15T03:49:06Z | nicksinger (nsinger@suse.com)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-4 priority-default closed behind-schedule" href="/issues/81058">action #81058</a>: [tracker-ticket] Power machines can't find installed OS. Automatic reboots disabled for now</i> added</li></ul>
openQA Infrastructure - action #81020: QA-Power8-4-kvm start failed since reboot on 2020-12-13 | https://progress.opensuse.org/issues/81020?journal_id=358178 | 2020-12-15T03:51:56Z | nicksinger (nsinger@suse.com)
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Resolved</i></li></ul><p>Resolving this in favor of the bigger tracking ticket regarding ppc boot problems. Also, I think this ticket mostly covers the immediate actions taken to get the host back up and running. Feel free to reopen if you disagree.</p>