https://progress.opensuse.org/https://progress.opensuse.org/themes/openSUSE/favicon/favicon.ico?15829177842021-08-10T14:56:41ZopenSUSE Project Management ToolopenQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4341832021-08-10T14:56:41Zlivdywanliv.dywan@suse.com
<ul><li><strong>Subject</strong> changed from <i>imagetester automatic updates don't work</i> to <i>automatic updates on imagetester don't work and it failed to come up after reboot</i></li><li><strong>Due date</strong> set to <i>2021-08-12</i></li><li><strong>Status</strong> changed from <i>New</i> to <i>Blocked</i></li><li><strong>Priority</strong> changed from <i>Normal</i> to <i>High</i></li></ul><p>I'm bumping prio since the machine is completely offline right now and not just outdated. I made this clear in the title also. And <em>Blocked</em> with a <em>due date</em> on Thursday on the off chance that we forget about it.</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4343762021-08-11T14:45:44Zlivdywanliv.dywan@suse.com
<ul><li><strong>Blocks</strong> <i><a class="issue tracker-4 status-3 priority-4 priority-default closed" href="/issues/96311">action #96311</a>: qemu error message is still "debug", should be "warn" or more severe size:S</i> added</li></ul> openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4343822021-08-11T14:48:50Zlivdywanliv.dywan@suse.com
<ul></ul><p>Seems like there are several redundant disk errors on the machine that look like this and it's currently stuck in grub:</p>
<pre><code>BTRFS error (device md0p1): Remounting read-write after error is not allowed
</code></pre> openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4347832021-08-13T09:59:24Zlivdywanliv.dywan@suse.com
<ul><li><strong>Due date</strong> changed from <i>2021-08-12</i> to <i>2021-08-20</i></li></ul><p>cdywan wrote:</p>
<blockquote>
<p>Seems like there are several redundant disk errors on the machine that look like this and it's currently stuck in grub:</p>
<pre><code>BTRFS error (device md0p1): Remounting read-write after error is not allowed
</code></pre></blockquote>
<p><a class="user active user-mention" href="https://progress.opensuse.org/users/21806">@osukup</a> reached out to @mrueckert to get access to the console</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4354912021-08-16T13:43:46Zlivdywanliv.dywan@suse.com
<ul><li><strong>Blocks</strong> deleted (<i><a class="issue tracker-4 status-3 priority-4 priority-default closed" href="/issues/96311">action #96311</a>: qemu error message is still "debug", should be "warn" or more severe size:S</i>)</li></ul> openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4356892021-08-17T08:04:32Zosukup
<ul></ul><p>infra asked for a new disk:</p>
<blockquote>
<blockquote>
<p>but more importantly, can you organise a new disk and send it to NUE so that I can physically replace it? </p>
</blockquote>
</blockquote>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4357252021-08-17T08:45:57Zokurzokurz@suse.com
<ul></ul><p>well, I guess we can get replacement hardware. We could wait for nsinger or try to go ahead ourselves and for example ask runger for help. I suggest to wait for nsinger to return from vacation and order together with him, ship to Nbg office and let EngInfra install the replacement hardware.</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4378642021-08-24T08:53:19Znicksingernsinger@suse.com
<ul></ul><p>a new disk sounds feasible. Just wondering: did we make sure the disk is actually broken? Or is it "just" the filesystem on there?<br>
Given that @mrueckert was involved I could imagine he checked but just to be sure :)</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4379092021-08-24T11:13:24Zosukup
<ul></ul><p><a class="user active user-mention" href="https://progress.opensuse.org/users/24624">@nicksinger</a> -> Marcus Rueckert isn't involved, the record in rackspace is obsolete. The ticket is handled by <a class="user active user-mention" href="https://progress.opensuse.org/users/75">@maxmaher</a> (Maximilian Maher), please contact him </p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4382542021-08-25T12:22:56Znicksingernsinger@suse.com
<ul></ul><p><a class="user active user-mention" href="https://progress.opensuse.org/users/21806">@osukup</a> could you please share this ticket? The ticket number, something?</p>
<p>I talked with gschlotter today to get access to the ipmi interface. Unfortunately it is an infra-only subnet we can't access. I currently still don't know what hardware we have in there…</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4382662021-08-25T12:44:10Zosukup
<ul></ul><p>you mean <a href="https://infra.nue.suse.com/SelfService/Display.html?id=194271" class="external">https://infra.nue.suse.com/SelfService/Display.html?id=194271</a> from description </p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4382752021-08-25T12:50:40Znicksingernsinger@suse.com
<ul><li><strong>Assignee</strong> changed from <i>osukup</i> to <i>nicksinger</i></li></ul> openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4382842021-08-25T13:13:40Znicksingernsinger@suse.com
<ul></ul><p>Right, overlooked it. I've updated the ticket and talked to Max. I need IPMI access to the machine to continue further:</p>
<pre><code>As discussed in RC it would be nice if somebody could reconfigure the switch so that we have access to the IPMI interface. This way I could 1) figure out if the HDD is actually broken or just the FS and 2) what is currently build in and what we need to buy. Since Max will be on vacation next week and might be to busy this week with other tasks it would be nice if somebody else from infra could take the reconfiguration of the switch.
</code></pre>
<p>I set this to blocked until this has happened.</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4383262021-08-25T17:06:55Zlivdywanliv.dywan@suse.com
<ul><li><strong>Due date</strong> changed from <i>2021-08-20</i> to <i>2021-09-10</i></li></ul><p>Moving <em>due date</em> as per conversation in chat since we're waiting on other people and it's not considered super urgent.</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4456232021-09-14T08:52:40Zokurzokurz@suse.com
<ul><li><strong>Due date</strong> changed from <i>2021-09-10</i> to <i>2021-09-17</i></li><li><strong>Status</strong> changed from <i>Blocked</i> to <i>Feedback</i></li></ul><p><a class="user active user-mention" href="https://progress.opensuse.org/users/24624">@nicksinger</a> the infra ticket was resolved on 2021-09-07, so did you check if you do have IPMI access or something?</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4461032021-09-15T08:53:10Znicksingernsinger@suse.com
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Blocked</i></li></ul><p>unfortunately I didn't receive notifications. The ticket was closed with "please open a jira SD ticket". Done so now: <a href="https://sd.suse.com/servicedesk/customer/portal/1/SD-60360" class="external">https://sd.suse.com/servicedesk/customer/portal/1/SD-60360</a> (can anybody see this besides me?)</p>
<p>Nothing else happened.</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4462652021-09-15T13:27:45Zlivdywanliv.dywan@suse.com
<ul></ul><p>nicksinger wrote:</p>
<blockquote>
<p>unfortunately I didn't receive notifications. The ticket was closed with "please open a jira SD ticket". Done so now: <a href="https://sd.suse.com/servicedesk/customer/portal/1/SD-60360" class="external">https://sd.suse.com/servicedesk/customer/portal/1/SD-60360</a> (can anybody see this besides me?)</p>
</blockquote>
<p>I can't. Did you CC or otherwise add the ml or individual team members?</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4474622021-09-20T09:28:47Znicksingernsinger@suse.com
<ul></ul><p>Unfortunately I can only "share" the tickets with real accounts and not e-mails (like MLs). I added you, oli and marius for now manually.</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4474832021-09-20T09:57:50Zlivdywanliv.dywan@suse.com
<ul><li><strong>Due date</strong> changed from <i>2021-09-17</i> to <i>2021-09-22</i></li></ul><p>Thanks! I can see it now. Bumping the due date to Wednesday for now.</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4483982021-09-22T15:06:54Zokurzokurz@suse.com
<ul><li><strong>Project</strong> changed from <i>openQA Project</i> to <i>openQA Infrastructure</i></li><li><strong>Subject</strong> changed from <i>automatic updates on imagetester don't work and it failed to come up after reboot</i> to <i>recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)</i></li><li><strong>Due date</strong> changed from <i>2021-09-22</i> to <i>2021-10-01</i></li><li><strong>Category</strong> deleted (<del><i>Regressions/Crashes</i></del>)</li><li><strong>Status</strong> changed from <i>Blocked</i> to <i>Feedback</i></li></ul><p>I could actually access the machine now over IPMI (see SD ticket). I created <a href="https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/358" class="external">https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/358</a></p>
<p>The SD ticket is still open and I asked for a proper DNS entry for IPMI. However, with the current state it should be possible to proceed hence "unblocking". </p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4492982021-09-24T13:37:34Znicksingernsinger@suse.com
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Workable</i></li></ul><p>we could assign a *.qa.suse.de domain if nothing happens in the ticket.</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4506992021-09-29T09:13:53Zlivdywanliv.dywan@suse.com
<ul></ul><p>Discussed it briefly in the unblock. We might use the IP or *.qam domain. Most importantly Nick will try and see how to restore the machine, maybe an office visit on Thursday</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4510382021-09-30T07:19:36Znicksingernsinger@suse.com
<ul></ul><p>I tried to access the machine over IPMI but apparently the console redirection is misconfigured in the BIOS. Access over IPMIViewer is also not possible. So I need to check the machine in person today. Hopefully I catch somebody from infra who can let me into srv1.</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4513532021-10-01T06:50:31Zokurzokurz@suse.com
<ul><li><strong>Due date</strong> changed from <i>2021-10-01</i> to <i>2021-10-05</i></li></ul><p>To be checked on-site at next possibility</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4516712021-10-03T17:55:10Zokurzokurz@suse.com
<ul></ul><p>Regarding a good name for the IPMI endpoint maxmaher created <a href="https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/1960" class="external">https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/1960</a> , not merged yet. It might be good for us to remember that repo so that at any next time we can create MRs ourselves</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4521692021-10-05T09:02:09Zokurzokurz@suse.com
<ul><li><strong>Due date</strong> changed from <i>2021-10-05</i> to <i>2021-10-22</i></li><li><strong>Priority</strong> changed from <i>High</i> to <i>Normal</i></li></ul><p><a class="user active user-mention" href="https://progress.opensuse.org/users/24624">@nicksinger</a> as discussed in the daily: since you could not find anyone in the office to collaborate with, please create an EngInfra ticket stating what goal we want to reach, e.g. check with an IPMI command that we can read what's going on in the serial terminal, and suggest changing UEFI settings or something similar "on-site"</p>
<p>Setting a much later due date as we rely a lot on individuals being present on-site in the nbg office and imagetester turned out to be not that critical right now, especially as we already found workarounds for running openQA workers in containers (for s390x, but we could apply the same elsewhere when we need to). Please still act urgently on raising the request with EngInfra, then we wait</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4522412021-10-05T11:27:35Znicksingernsinger@suse.com
<ul><li><strong>Status</strong> changed from <i>Workable</i> to <i>In Progress</i></li></ul><p>Alright, after talking to <a class="user active user-mention" href="https://progress.opensuse.org/users/22072">@mkittler</a> today after the javaws stuff I had another idea how to access the iKVM of that machine. I went to the web interface of the BMC at <a href="http://10.160.65.195">http://10.160.65.195</a> and clicked on the "Remote Console Preview" image there. This offers you to download a <code>launch.jnlp</code> file. Executing it on the console unfortunately fails with:</p>
<pre><code>nsinger@workstation ~/Downloads » LANG=C javaws launch.jnlp
selected jre: /etc/java-config-2/current-system-vm/jre/
Warning!, Fall back in resolve_jar to hardcoded paths:
no
selected jre: /etc/java-config-2/current-system-vm/jre/
Warning!, Fall back in resolve_jar to hardcoded paths:
no
You are trying to get resource https://10.160.65.195:443/iKVM.jar but it is not in cache and could not be downloaded. Attempting to continue, but you may expect failure
You are trying to get resource https://10.160.65.195:443/liblinux_x86_64.jar but it is not in cache and could not be downloaded. Attempting to continue, but you may expect failure
JAR https://10.160.65.195:443/iKVM.jar not found. Continuing.
JAR https://10.160.65.195:443/liblinux_x86_64.jar not found. Continuing.
JAR https://10.160.65.195:443/iKVM.jar not found. Continuing.
JAR https://10.160.65.195:443/liblinux_x86_64.jar not found. Continuing.
netx: Initialization Error: Could not initialize application. (Fatal: Initialization Error: Unknown Main-Class. Could not determine the main class for this application.)
net.sourceforge.jnlp.LaunchException: Fatal: Initialization Error: Could not initialize application. The application has not been initialized, for more information execute javaws from the command line.
at net.sourceforge.jnlp.Launcher.createApplication(Launcher.java:822)
at net.sourceforge.jnlp.Launcher.launchApplication(Launcher.java:531)
at net.sourceforge.jnlp.Launcher$TgThread.run(Launcher.java:945)
Caused by: net.sourceforge.jnlp.LaunchException: Fatal: Initialization Error: Unknown Main-Class. Could not determine the main class for this application.
at net.sourceforge.jnlp.runtime.JNLPClassLoader.initializeResources(JNLPClassLoader.java:774)
at net.sourceforge.jnlp.runtime.JNLPClassLoader.<init>(JNLPClassLoader.java:338)
at net.sourceforge.jnlp.runtime.JNLPClassLoader.createInstance(JNLPClassLoader.java:421)
at net.sourceforge.jnlp.runtime.JNLPClassLoader.getInstance(JNLPClassLoader.java:495)
at net.sourceforge.jnlp.runtime.JNLPClassLoader.getInstance(JNLPClassLoader.java:468)
at net.sourceforge.jnlp.Launcher.createApplication(Launcher.java:814)
... 2 more
</code></pre>
<p>Following the leads in the error message I looked at an strace of the same "Remote console" but from openqaworker8. I grabbed its <code>liblinux_x86_64.jar</code> and <code>iKVM.jar</code> in the hope that it might somehow work. It didn't. But the local webserver showed me which files imagetester's BMC actually requested:</p>
<pre><code>2021-10-05 13:04:33.991 [INFO ] [::ffff:127.0.0.1]:35580 - HEAD /liblinux_x86_64__V1.0.3.jar.pack.gz (local: ./liblinux_x86_64__V1.0.3.jar.pack.gz)
2021-10-05 13:04:33.991 [INFO ] [::ffff:127.0.0.1]:35582 - HEAD /iKVM__V1.69.13.0x0.jar.pack.gz (local: ./iKVM__V1.69.13.0x0.jar.pack.gz)
</code></pre>
<p>With this information I was finally able to request the original files from imagetester and put them into a temporary directory:</p>
<pre><code>curl -k https://10.160.65.195:443/liblinux_x86_64__V1.0.3.jar.pack.gz > /tmp/ikvm/liblinux_x86_64__V1.0.3.jar.pack.gz
curl -k https://10.160.65.195:443/iKVM__V1.69.13.0x0.jar.pack.gz > /tmp/ikvm/iKVM__V1.69.13.0x0.jar.pack.gz
</code></pre>
<p>I then started a webserver serving these files and changed the first line of <code>launch.jnlp</code> from</p>
<pre><code><jnlp spec="1.0+" codebase="https://10.160.65.195:443/">
</code></pre>
<p>to:</p>
<pre><code><jnlp spec="1.0+" codebase="http://127.0.0.1:8888/">
</code></pre>
<p>Now I can start this modified jnlp-file with <code>javaws launch.jnlp</code> and it opens up a java application where I'm able to see the graphical output of imagetester. I can now start the investigation and we're not blocked by infra anymore.</p>
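<p>For reference, the steps above can be sketched as a small script. This is only an illustration under assumptions: the comment does not say which webserver was used, so <code>python3 -m http.server</code> stands in for it, <code>sed -i</code> assumes GNU sed, and the helper name <code>rewrite_codebase</code> is made up for this sketch.</p>
<pre><code>#!/bin/sh
# Sketch only: point the codebase attribute of a JNLP file at a local mirror.
# rewrite_codebase is a hypothetical helper, not part of javaws or the BMC.
rewrite_codebase() {
    # $1: path to launch.jnlp, $2: new codebase URL (must not contain | or &)
    sed -i "s|codebase=\"[^\"]*\"|codebase=\"$2\"|" "$1"
}

# Usage with the values from this comment (webserver choice is an assumption):
#   rewrite_codebase launch.jnlp http://127.0.0.1:8888/
#   (cd /tmp/ikvm && python3 -m http.server 8888 --bind 127.0.0.1) &
#   javaws launch.jnlp
</code></pre>
<p>The helper only touches the <code>codebase</code> attribute, so the rest of the downloaded <code>launch.jnlp</code> stays as the BMC generated it.</p>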
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4522952021-10-05T13:28:50Znicksingernsinger@suse.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Feedback</i></li></ul><p>I booted a systemrescuecd and scrubbed the btrfs filesystem on the disk: no issues reported. Smart values look fine but error reporting doesn't seem to be supported so I couldn't check if there was any error reported in the past.<br>
I then chrooted into the system (following <a href="https://wiki.gentoo.org/wiki/Chroot/en#Configuration" class="external">https://wiki.gentoo.org/wiki/Chroot/en#Configuration</a>):</p>
<pre><code>chroot /mnt/mychroot
mount -a
transactional-update shell
zypper ref
zypper up
</code></pre>
<p>but it failed with:</p>
<pre><code>Download (curl) error for 'http://download.opensuse.org/repositories/devel:/openQA/openSUSE_Leap_15.2/x86_64/os-autoinst-4.6.1633339717.7d37d2ac-lp152.875.1.x86_64.rpm':
Error code: Curl error 60
Error message: SSL certificate problem: self signed certificate in certificate chain
</code></pre>
<p>(all previous 227 packages were fine and no such error came up). I then tried to reboot the machine into the live-system again and it came back up. Not sure if the scrub fixed it or if transactional-update did some magic to make it work again. On the live system I gave it another try with <code>transactional-update up</code> which successfully updated all installed packages. I then did another reboot and the machine came up successfully without any failing services in systemd. <code>transactional-update.timer</code> is also running, so let's see if the machine now works again.</p>
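<p>For reference, the scrub and SMART checks mentioned in this comment could look roughly like the sketch below. The exact commands that were run are not recorded here, so this is an assumption: <code>/dev/sda</code> is a placeholder for the real disk, and the helpers <code>run</code> and <code>check_disk</code> are made up for illustration.</p>
<pre><code>#!/bin/sh
# Sketch only: scrub a mounted btrfs filesystem and query SMART data.
# MOUNTPOINT and DISK are placeholders; this thread only names md0p1.
MOUNTPOINT="${MOUNTPOINT:-/mnt/mychroot}"
DISK="${DISK:-/dev/sda}"

# run prints the command when DRY_RUN=1, otherwise executes it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# check_disk is a hypothetical wrapper around the three checks.
check_disk() {
    run btrfs scrub start -B "$MOUNTPOINT"   # -B: stay in foreground until done
    run smartctl -H "$DISK"                  # overall health self-assessment
    run smartctl -l error "$DISK"            # error log, if the drive supports it
}
</code></pre>
<p>Calling <code>DRY_RUN=1 check_disk</code> first only prints the commands, which is a cheap way to double-check the device names before touching a disk that may be failing.</p>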
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4522982021-10-05T13:33:45Znicksingernsinger@suse.com
<ul></ul><p>Unfortunately I never got SOL to work. I checked the BIOS and everything looked fine regarding console redirection. I also tried changing the port from COM3* to COM2 and COM1 and tried several "IRQ" serial configurations. According to the BIOS docs, the * in "COM3*" means this should be the console for SOL. I even got key presses redirected to the BIOS over ipmitool but nothing comes back in my local terminal. Also, echoing stuff inside Linux to the corresponding /dev/ttyS* devices yields absolutely no output in my local ipmitool session. Given that we never even had BMC access before, I consider this "good enough" as I found a workaround for accessing the screen over this quirky <a href="https://progress.opensuse.org/issues/96719#note-27" class="external">java "hack" mentioned above</a>.</p>
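<p>For anyone retrying SOL later, a minimal sketch of the ipmitool side is shown below. It is an assumption about how the session would be started, not a record of the commands used here: the user name <code>ADMIN</code> is a placeholder, the password is expected in the <code>IPMI_PASSWORD</code> environment variable (ipmitool's <code>-E</code> option), and the wrapper <code>sol</code> is made up for this sketch.</p>
<pre><code>#!/bin/sh
# Sketch only: wrap the ipmitool SOL subcommands for one BMC.
# BMC_USER is a placeholder; -E makes ipmitool read IPMI_PASSWORD.
BMC_HOST="${BMC_HOST:-10.160.65.195}"
BMC_USER="${BMC_USER:-ADMIN}"
IPMITOOL="${IPMITOOL:-ipmitool}"

# sol is a hypothetical helper; pass info, activate or deactivate.
sol() {
    "$IPMITOOL" -I lanplus -H "$BMC_HOST" -U "$BMC_USER" -E sol "$@"
}

# Usage:
#   sol info         # show the BMC's SOL configuration
#   sol activate     # attach to the serial console
#   sol deactivate   # drop a stale session
</code></pre>
<p>Setting <code>IPMITOOL=echo</code> prints the arguments instead of contacting the BMC, which helps when checking the command line without network access to the infra subnet.</p>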
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4526912021-10-06T09:35:27Zlivdywanliv.dywan@suse.com
<ul></ul><p>Please reboot it once more to ensure reboot stability, and then we can assume it "works".</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4527002021-10-06T09:41:30Zokurzokurz@suse.com
<ul></ul><p>Nicely done. Impressive story and well done. I mentioned the note also in <a href="https://progress.opensuse.org/projects/openqav3/wiki/Wiki/diff?utf8=%E2%9C%93&version=128&version_from=127&commit=View+differences" class="external">https://progress.opensuse.org/projects/openqav3/wiki/Wiki/diff?utf8=%E2%9C%93&version=128&version_from=127&commit=View+differences</a> . Could you add a reference to our salt pillars file which also references o3 hosts?</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4527242021-10-06T10:42:07Znicksingernsinger@suse.com
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Resolved</i></li></ul><p>Did a <code>transactional-update dup reboot</code>, updates were successful and a reboot got scheduled:</p>
<pre><code>imagetester:~ # rebootmgrctl is-active
RebootMgr is active
imagetester:~ # rebootmgrctl get-strategy
Reboot strategy: best-effort
imagetester:~ # rebootmgrctl get-window
Maintenance window is set to *-*-* 03:30:00, lasting 01h30m.
imagetester:~ # rebootmgrctl status
Status: Reboot requested, waiting for maintenance window
</code></pre>
<p>I canceled that reboot and triggered an immediate one: <code>rebootmgrctl reboot now</code>. After a few seconds the machine was back up and running.</p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=4527632021-10-06T11:32:36Znicksingernsinger@suse.com
<ul></ul><p>okurz wrote:</p>
<blockquote>
<p>Nicely done. Impressive story and well done. I mentioned the note also in <a href="https://progress.opensuse.org/projects/openqav3/wiki/Wiki/diff?utf8=%E2%9C%93&version=128&version_from=127&commit=View+differences" class="external">https://progress.opensuse.org/projects/openqav3/wiki/Wiki/diff?utf8=%E2%9C%93&version=128&version_from=127&commit=View+differences</a> . Could you add a reference to our salt pillars file which also references o3 hosts?</p>
</blockquote>
<p>Thanks! I made a more extensive wiki entry containing this information and referenced it with <a href="https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/359" class="external">https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/359</a></p>
openQA Infrastructure - action #96719: recover imagetester with broken filesystem/hardware (was: automatic updates on imagetester don't work and it failed to come up after reboot)https://progress.opensuse.org/issues/96719?journal_id=6681682023-09-04T11:45:39Zokurzokurz@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-4 priority-default closed" href="/issues/135137">action #135137</a>: Bring back imagetester size:M</i> added</li></ul>