https://progress.opensuse.org/https://progress.opensuse.org/themes/openSUSE/favicon/favicon.ico?15829177842021-12-09T21:37:25ZopenSUSE Project Management ToolopenQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4720362021-12-09T21:37:25Zjlausuchjalausuch@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-6 priority-5 priority-high3 closed" href="/issues/101295">action #101295</a>: [timebox: 8h][sporadic] test fails in verify_default_target</i> added</li></ul> openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4723772021-12-10T09:19:29Zmaritawernermawerner@suse.com
<ul></ul><p><a class="user active user-mention" href="https://progress.opensuse.org/users/25856">@jlausuch</a> is that a ticket for the yast team? Or more for the QE Core team? Or both?</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4723802021-12-10T09:29:06Zoorlovoorlov@suse.com
<ul></ul><p>Marita, it looks like it is a ticket for qe-tools, as this is related to openQA itself. It is not something related to test code.</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4723832021-12-10T09:31:59Zjlausuchjalausuch@suse.com
<ul></ul><p>Exactly, it has nothing to do with YaST, as it looks like something related to the openQA backend.</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4724582021-12-10T12:37:47Zjlausuchjalausuch@suse.com
<ul></ul><p>Another example: <a href="https://openqa.suse.de/tests/7820905#step/docker_firewall/5" class="external">https://openqa.suse.de/tests/7820905#step/docker_firewall/5</a></p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4725092021-12-10T16:24:57Zmkittlermarius.kittler@suse.com
<ul></ul><p>That's a recent regression, right? The first commit that comes to my mind is <a href="https://github.com/os-autoinst/os-autoinst/commit/d5eb330962dc9f13230af29e05eea7cefebd3124" class="external">https://github.com/os-autoinst/os-autoinst/commit/d5eb330962dc9f13230af29e05eea7cefebd3124</a> as it affects failing jobs specifically.</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4725102021-12-10T16:37:45Zjlausuchjalausuch@suse.com
<ul></ul><p>mkittler wrote:</p>
<blockquote>
<p>That's a recent regression, right? The first commit that comes to my mind is <a href="https://github.com/os-autoinst/os-autoinst/commit/d5eb330962dc9f13230af29e05eea7cefebd3124" class="external">https://github.com/os-autoinst/os-autoinst/commit/d5eb330962dc9f13230af29e05eea7cefebd3124</a> as it affects failing jobs specifically.</p>
</blockquote>
<p>Yes, I've only started noticing failures like this recently.</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4725662021-12-11T07:41:09Zjlausuchjalausuch@suse.com
<ul></ul><p>Another failure that might be related:<br>
<a href="https://openqa.suse.de/tests/7832724#step/btrfs_send_receive/2" class="external">https://openqa.suse.de/tests/7832724#step/btrfs_send_receive/2</a><br>
Here, after <code>snapper_cleanup</code> fails, the next modules fail due to <code>Failed to wait for login prompt</code>.</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4727312021-12-13T09:24:06Zmaritawernermawerner@suse.com
<ul><li><strong>Subject</strong> changed from <i>After module failure, the console is broken</i> to <i>[qe-core] After module failure, the console is broken</i></li></ul> openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4727342021-12-13T09:24:38Zmaritawernermawerner@suse.com
<ul><li><strong>Subject</strong> changed from <i>[qe-core] After module failure, the console is broken</i> to <i>After module failure, the console is broken</i></li></ul> openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4727372021-12-13T09:27:19Zmaritawernermawerner@suse.com
<ul><li><strong>Project</strong> changed from <i>openQA Tests</i> to <i>openQA Project</i></li><li><strong>Category</strong> deleted (<del><i>Bugs in existing tests</i></del>)</li></ul> openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4732112021-12-14T11:19:08Zokurzokurz@suse.com
<ul><li><strong>Category</strong> set to <i>Regressions/Crashes</i></li><li><strong>Target version</strong> set to <i>Ready</i></li></ul> openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4739012021-12-16T10:41:54Zmkittlermarius.kittler@suse.com
<ul><li><strong>Subject</strong> changed from <i>After module failure, the console is broken</i> to <i>After module failure, the console is broken size:M</i></li><li><strong>Description</strong> updated (<a title="View differences" href="/journals/473901/diff?detail_id=448398">diff</a>)</li><li><strong>Status</strong> changed from <i>New</i> to <i>Workable</i></li></ul> openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4743722021-12-17T12:59:55Zjlausuchjalausuch@suse.com
<ul></ul><p>Another example that might be related: <a href="https://openqa.suse.de/tests/7871222#step/btrfs_qgroups/2" class="external">https://openqa.suse.de/tests/7871222#step/btrfs_qgroups/2</a></p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4803042022-01-14T15:51:19Zmkittlermarius.kittler@suse.com
<ul></ul><blockquote>
<p>After loading snapshots in os-autoinst use QEMU monitoring commands …</p>
</blockquote>
<p>Note that <a href="https://openqa.suse.de/tests/7806676#step/cifs/4" class="external">https://openqa.suse.de/tests/7806676#step/cifs/4</a> (the first job mentioned in the ticket description) is actually not using the QEMU backend. The svirt backend, which is used here, also supports snapshots, so it could still be a performance problem when loading snapshots. However, QEMU commands won't always help.</p>
<p>Apparently, the problem can be reproduced quite reliably: <a href="https://openqa.suse.de/tests/7924937#next_previous" class="external">https://openqa.suse.de/tests/7924937#next_previous</a> - There was not even a single job in that recent history that was <em>not</em> affected.</p>
<hr>
<p>I don't understand what the problem is in the libvorbis example, as there's just a single failing module.</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4853102022-01-28T23:55:12Zopenqa_reviewopenqa-review@suse.de
<ul></ul><p>This is an autogenerated message for openQA integration by the openqa_review script:</p>
<p>This bug is still referenced in a failing openQA test: minimal+role_minimal<br>
<a href="https://openqa.suse.de/tests/8029475" class="external">https://openqa.suse.de/tests/8029475</a></p>
<p>To prevent further reminder comments one of the following options should be followed:</p>
<ol>
<li>The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted</li>
<li>The openQA job group is moved to "Released" or "EOL" (End-of-Life)</li>
<li>The bugref in the openQA scenario is removed or replaced, e.g. <code>label:wontfix:boo1234</code></li>
</ol>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4854782022-01-31T08:56:39Zokurzokurz@suse.com
<ul><li><strong>Priority</strong> changed from <i>Normal</i> to <i>High</i></li></ul><p>Treating as high due to the reminder comment, as someone is likely waiting for this ticket to be resolved.</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4855202022-01-31T10:03:20Zlivdywanliv.dywan@suse.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/485520/diff?detail_id=459558">diff</a>)</li></ul><p>Discussed the ticket after the daily, and came up with a suggestion to visualize the snapshotting in the test module execution</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4909522022-02-15T10:23:07Zlivdywanliv.dywan@suse.com
<ul><li><strong>Status</strong> changed from <i>Workable</i> to <i>In Progress</i></li><li><strong>Assignee</strong> set to <i>livdywan</i></li></ul><p>Asked for some pointers in the daily, as I wasn't sure whether to extend the JS or the Perl side of openQA, and received the suggestion to really solve it in os-autoinst without special-casing.</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4912602022-02-16T04:10:32Zopenqa_reviewopenqa-review@suse.de
<ul><li><strong>Due date</strong> set to <i>2022-03-02</i></li></ul><p>Setting due date based on mean cycle time of SUSE QE Tools</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4932612022-02-21T10:10:40Zlivdywanliv.dywan@suse.com
<ul></ul><p>I prepared <a href="https://github.com/os-autoinst/os-autoinst/pull/1960" class="external">a draft against os-autoinst</a>, confirming the virtual lack of test coverage related to snapshots to start with. Next step adding my proposed fix, but also tests because I'm into TDD</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4946742022-02-23T22:25:13Zlivdywanliv.dywan@suse.com
<ul></ul><p>cdywan wrote:</p>
<blockquote>
<p>I prepared <a href="https://github.com/os-autoinst/os-autoinst/pull/1960" class="external">a draft against os-autoinst</a>, confirming the virtual lack of test coverage related to snapshots to start with. Next step adding my proposed fix, but also tests because I'm into TDD</p>
</blockquote>
<ul>
<li>I updated the original draft to implement snapshot visualization via record info files, and added autotest coverage.</li>
<li>Since I kept wading through deprecation messages I also proposed an orthogonal fix for that <a href="https://github.com/os-autoinst/os-autoinst/pull/1965" class="external">https://github.com/os-autoinst/os-autoinst/pull/1965</a></li>
<li>I prepared another branch which adds coverage for qemu-based snapshot logging as part of the fullstack test: <a href="https://github.com/os-autoinst/os-autoinst/pull/1966" class="external">https://github.com/os-autoinst/os-autoinst/pull/1966</a></li>
<li>Yet another branch addresses svirt-specific snapshot features - this is not totally obvious but there's quite a bit of backend-specific code meaning two jobs using snapshots can fail very differently if something goes wrong: <a href="https://github.com/os-autoinst/os-autoinst/pull/1967" class="external">https://github.com/os-autoinst/os-autoinst/pull/1967</a></li>
</ul>
<p>Only the first one is required to resolve this ticket, but as I mentioned I wanted to confirm gaps in coverage while I'm working on this since I had to disambiguate different issues for myself anyway and this ticket gets linked to very different jobs.</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4972662022-03-02T18:49:48Zlivdywanliv.dywan@suse.com
<ul></ul><p>cdywan wrote:</p>
<blockquote>
<p>cdywan wrote:</p>
<blockquote>
<p>I prepared <a href="https://github.com/os-autoinst/os-autoinst/pull/1960" class="external">a draft against os-autoinst</a>, confirming the virtual lack of test coverage related to snapshots to start with. Next step adding my proposed fix, but also tests because I'm into TDD</p>
</blockquote>
<ul>
<li>I updated the original draft to implement snapshot visualization via record info files, and added autotest coverage.</li>
</ul>
</blockquote>
<p>Apparently I'm hitting <a href="https://github.com/os-autoinst/os-autoinst/pull/1960#discussion_r817948236" class="external">confusing behaviors</a> where <code>$current_test</code> is not defined when the <code>lastgood</code> snapshot gets loaded.</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4977312022-03-04T08:12:44Zlivdywanliv.dywan@suse.com
<ul><li><strong>Due date</strong> changed from <i>2022-03-02</i> to <i>2022-03-11</i></li></ul><p>cdywan wrote:</p>
<blockquote>
<p>cdywan wrote:</p>
<blockquote>
<p>cdywan wrote:</p>
<blockquote>
<p>I prepared <a href="https://github.com/os-autoinst/os-autoinst/pull/1960" class="external">a draft against os-autoinst</a>, confirming the virtual lack of test coverage related to snapshots to start with. Next step adding my proposed fix, but also tests because I'm into TDD</p>
</blockquote>
<ul>
<li>I updated the original draft to implement snapshot visualization via record info files, and added autotest coverage.</li>
</ul>
</blockquote>
<p>Apparently I'm hitting <a href="https://github.com/os-autoinst/os-autoinst/pull/1960#discussion_r817948236" class="external">confusing behaviors</a> where <code>$current_test</code> is not defined when the <code>lastgood</code> snapshot gets loaded.</p>
</blockquote>
<p>Aiming to wrap this up next week, and getting some ideas from team members (to support our hackweek I didn't actively ask for help in the usual calls).</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4983882022-03-08T07:16:45Zjlausuchjalausuch@suse.com
<ul></ul><p>I don't see it too often in JeOS tests, but today I noticed:<br>
<a href="https://openqa.suse.de/tests/8285438#step/btrfs_qgroups/2" class="external">https://openqa.suse.de/tests/8285438#step/btrfs_qgroups/2</a></p>
<p>After a timeout failure in btrfs_autocompletion, all the following modules fail with<br>
<code>Test died: Failed to wait for login prompt at sle/lib/serial_terminal.pm line 114.</code></p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4991892022-03-09T17:42:13Zlivdywanliv.dywan@suse.com
<ul></ul><p>New, cleaner approach to show the snapshot loading in tests <a href="https://github.com/os-autoinst/os-autoinst/pull/1987" class="external">https://github.com/os-autoinst/os-autoinst/pull/1987</a></p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4992042022-03-09T18:29:54Zlivdywanliv.dywan@suse.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Feedback</i></li></ul><p>cdywan wrote:</p>
<blockquote>
<p>New, cleaner approach to show the snapshot loading in tests <a href="https://github.com/os-autoinst/os-autoinst/pull/1987" class="external">https://github.com/os-autoinst/os-autoinst/pull/1987</a></p>
</blockquote>
<p>Merged</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4992152022-03-09T20:47:53Zokurzokurz@suse.com
<ul></ul><p>I deployed the change on openqaworker7 and triggered a specific test job:</p>
<pre><code>openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org/t2236452 _GROUP=0 BUILD=okurz_poo103791 TEST=krypton-live-okurz_poo103791 EXCLUDE_MODULES=systemsettings5,dolphin,konsole,desktop_mainmenu,kontact,shutdown WORKER_CLASS=openqaworker7
</code></pre>
<p>This should fail in firefox (at least the job template failed), load a snapshot, and show a test module result with the information that a snapshot was loaded. It should then fail again in kate and load a snapshot again, but not show a box because there is no next test module.</p>
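<p>A minimal sketch of that expectation, assuming a simplified model in which every failed module triggers a snapshot load and the info box is attached to whichever module runs next. The function name <code>snapshot_boxes</code> and the module order are hypothetical and only for illustration; this is not os-autoinst code and it does not model milestone flags.</p>
<pre><code>def snapshot_boxes(modules, failing):
    """Return the modules that would show a 'snapshot loaded' box.

    Assumed model: a failed module triggers a snapshot load, and the box
    is shown on the module that runs next, if there is one.
    """
    boxes = []
    for index, name in enumerate(modules):
        if name not in failing:
            continue
        # Slicing yields an empty list when the failed module is the last one.
        boxes.extend(modules[index + 1:index + 2])
    return boxes


# Hypothetical module order; the real job contains more modules.
modules = ["firefox", "firefox_audio", "kate"]
print(snapshot_boxes(modules, {"firefox", "kate"}))
</code></pre>
<p>Under these assumptions only the module after firefox gets a box. The failure in kate still loads a snapshot, but there is no following module to show it on.</p>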
<p>Created job #2236467: opensuse-Tumbleweed-Krypton-Live-x86_64-Build4.33-krypton-live@USBboot_64-2G -> <a href="https://openqa.opensuse.org/t2236467" class="external">https://openqa.opensuse.org/t2236467</a></p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4992242022-03-09T21:22:47Zokurzokurz@suse.com
<ul></ul><p><a href="https://openqa.opensuse.org/tests/2236467#step/firefox_audio/1" class="external">https://openqa.opensuse.org/tests/2236467#step/firefox_audio/1</a> shows the expected snapshot loading icon. Though I wonder why there is none in <a href="https://openqa.opensuse.org/tests/2236467#step/system_prepare/1" class="external">https://openqa.opensuse.org/tests/2236467#step/system_prepare/1</a>. I suggest cross-checking the difference in the algorithm for milestone and non-milestone.</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4996342022-03-10T17:23:13Zlivdywanliv.dywan@suse.com
<ul></ul><p>okurz wrote:</p>
<blockquote>
<p><a href="https://openqa.opensuse.org/tests/2236467#step/firefox_audio/1" class="external">https://openqa.opensuse.org/tests/2236467#step/firefox_audio/1</a> shows the expected snapshot loading icon. Though I wonder why there is none in <a href="https://openqa.opensuse.org/tests/2236467#step/system_prepare/1" class="external">https://openqa.opensuse.org/tests/2236467#step/system_prepare/1</a>. I suggest cross-checking the difference in the algorithm for milestone and non-milestone.</p>
</blockquote>
<p>This looks to me like (not) being in the same category makes the difference, although I can't back up that observation with code 🤔️</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=4996462022-03-10T18:44:19Zokurzokurz@suse.com
<ul><li><strong>Due date</strong> deleted (<del><i>2022-03-11</i></del>)</li><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Resolved</i></li></ul><p>yeah, let's just pretend we have not seen this problem and call the problem done :)</p>
<p>For everyone, please keep in mind that this will not fix specific problems in test code, which still need to be fixed individually. This only makes it clearer where a snapshot was loaded, with the potential impact that can have.</p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=5003992022-03-14T07:19:47Zjlausuchjalausuch@suse.com
<ul></ul><p><a href="https://openqa.suse.de/tests/8321241" class="external">https://openqa.suse.de/tests/8321241</a><br>
<a href="https://openqa.suse.de/tests/8322911" class="external">https://openqa.suse.de/tests/8322911</a><br>
<a href="https://openqa.suse.de/tests/8322915" class="external">https://openqa.suse.de/tests/8322915</a></p>
openQA Project - action #103791: After module failure, the console is broken size:Mhttps://progress.opensuse.org/issues/103791?journal_id=5005642022-03-14T12:26:03Zlivdywanliv.dywan@suse.com
<ul></ul><p>jlausuch wrote:</p>
<blockquote>
<p><a href="https://openqa.suse.de/tests/8321241" class="external">https://openqa.suse.de/tests/8321241</a><br>
<a href="https://openqa.suse.de/tests/8322911" class="external">https://openqa.suse.de/tests/8322911</a><br>
<a href="https://openqa.suse.de/tests/8322915" class="external">https://openqa.suse.de/tests/8322915</a></p>
</blockquote>
<p>All of these show the snapshot loading after <em>btrfs_autocompletion</em> failed, including subsequent modules, because <code>testapi::select_console("root-virtio-terminal")</code> gets stuck. So it looks to me like a consequence of #108064.</p>