https://progress.opensuse.org/https://progress.opensuse.org/themes/openSUSE/favicon/favicon.ico?15829177842018-03-06T11:56:14ZopenSUSE Project Management ToolopenQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=983682018-03-06T11:56:14Zmgriessmeiermgriessmeier@suse.com
<ul><li><strong>Is duplicate of</strong> <i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed behind-schedule" href="/issues/31543">action #31543</a>: [sles][functional][tools][s390x][ipmi][hard][sporadic] test incompletes - "DIE The console isn't responding correctly. Maybe half-open socket?"</i> added</li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=983772018-03-06T12:04:42Zmgriessmeiermgriessmeier@suse.com
<ul><li><strong>Is duplicate of</strong> deleted (<i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed behind-schedule" href="/issues/31543">action #31543</a>: [sles][functional][tools][s390x][ipmi][hard][sporadic] test incompletes - "DIE The console isn't responding correctly. Maybe half-open socket?"</i>)</li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=983862018-03-06T12:04:56Zmgriessmeiermgriessmeier@suse.com
<ul><li><strong>Has duplicate</strong> <i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed behind-schedule" href="/issues/31543">action #31543</a>: [sles][functional][tools][s390x][ipmi][hard][sporadic] test incompletes - "DIE The console isn't responding correctly. Maybe half-open socket?"</i> added</li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=983922018-03-06T12:05:38Zmgriessmeiermgriessmeier@suse.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Rejected</i></li><li><strong>Assignee</strong> deleted (<del><i>szarate</i></del>)</li></ul><p>reject because duplicate of <a href="https://progress.opensuse.org/issues/31543" class="external">https://progress.opensuse.org/issues/31543</a></p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1001652018-03-12T10:33:49Zmgriessmeiermgriessmeier@suse.com
<ul><li><strong>Subject</strong> changed from <i>[tools] Incomplete job because console isn't responding correctly.</i> to <i>[sles][functional][tools][ipmi] Incomplete job because console isn't responding correctly. Half-open socket on IPMI</i></li><li><strong>Category</strong> set to <i>Bugs in existing tests</i></li><li><strong>Status</strong> changed from <i>Rejected</i> to <i>Workable</i></li><li><strong>Target version</strong> set to <i>Milestone 15</i></li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1001712018-03-12T10:33:59Zmgriessmeiermgriessmeier@suse.com
<ul><li><strong>Has duplicate</strong> deleted (<i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed behind-schedule" href="/issues/31543">action #31543</a>: [sles][functional][tools][s390x][ipmi][hard][sporadic] test incompletes - "DIE The console isn't responding correctly. Maybe half-open socket?"</i>)</li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1002852018-03-13T03:34:17Zxlaixlai@suse.com
<ul><li><strong>Category</strong> changed from <i>Bugs in existing tests</i> to <i>Infrastructure</i></li><li><strong>Assignee</strong> set to <i>szarate</i></li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1002972018-03-13T06:24:18Zxlaixlai@suse.com
<ul><li><strong>Category</strong> changed from <i>Infrastructure</i> to <i>Bugs in existing tests</i></li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1003872018-03-13T08:55:59Zmitiao
<ul><li><strong>Assignee</strong> changed from <i>szarate</i> to <i>mitiao</i></li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1005402018-03-13T10:19:16Zmgriessmeiermgriessmeier@suse.com
<ul><li><strong>Due date</strong> set to <i>2018-03-27</i></li></ul><p>planned for next sprint</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1020012018-03-14T09:52:36Znicksingernsinger@suse.com
<ul><li><strong>Subject</strong> changed from <i>[sles][functional][tools][ipmi] Incomplete job because console isn't responding correctly. Half-open socket on IPMI</i> to <i>[sles][functional][tools][ipmi][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMI</i></li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1052082018-03-26T03:13:07Zmitiao
<ul></ul><p><a class="user active user-mention" href="https://progress.opensuse.org/users/22250">@xlai</a>, have you set this var:<br>
_CHKSEL_RATE_WAIT_TIME=120<br>
It may not solve the issue completely, but append it to your test to see whether it reduces the frequency of the issue.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1053252018-03-26T09:44:38Zxlaixlai@suse.com
<ul><li><strong>Status</strong> changed from <i>Workable</i> to <i>In Progress</i></li></ul><p>mitiao wrote:</p>
<blockquote>
<p><a class="user active user-mention" href="https://progress.opensuse.org/users/22250">@xlai</a>, have you set this var:<br>
_CHKSEL_RATE_WAIT_TIME=120<br>
It may not solve the issue completely, but append it to your test to see if it will reduce the frequency of isse</p>
</blockquote>
<p>Will try. Thanks for the advice.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1055832018-03-27T09:04:18Zmgriessmeiermgriessmeier@suse.com
<ul><li><strong>Due date</strong> changed from <i>2018-03-27</i> to <i>2018-04-24</i></li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1103622018-04-09T03:00:30Zxlaixlai@suse.com
<ul></ul><p>xlai wrote:</p>
<blockquote>
<p>mitiao wrote:</p>
<blockquote>
<p><a class="user active user-mention" href="https://progress.opensuse.org/users/22250">@xlai</a>, have you set this var:<br>
_CHKSEL_RATE_WAIT_TIME=120<br>
It may not solve the issue completely, but append it to your test to see if it will reduce the frequency of isse</p>
</blockquote>
<p>Will try. Thanks for the advice.</p>
</blockquote>
<p>With the parameter set, the issue happened again on the latest build, 550.2.</p>
<p>Job list (3 in total, failure ratio 3/34, nearly 10%):<br>
<a href="https://openqa.suse.de/tests/1599762#" class="external">https://openqa.suse.de/tests/1599762#</a><br>
<a href="https://openqa.suse.de/tests/1599951/file/autoinst-log.txt" class="external">https://openqa.suse.de/tests/1599951/file/autoinst-log.txt</a><br>
<a href="https://openqa.suse.de/tests/1599952/file/autoinst-log.txt" class="external">https://openqa.suse.de/tests/1599952/file/autoinst-log.txt</a></p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1114032018-04-10T13:54:53Zokurzokurz@suse.com
<ul><li><strong>Subject</strong> changed from <i>[sles][functional][tools][ipmi][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMI</i> to <i>[sle][functional][u][tools][ipmi][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMI</i></li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1117062018-04-11T03:21:54Zxlaixlai@suse.com
<ul></ul><p>xlai wrote:</p>
<blockquote>
<p>xlai wrote:</p>
<blockquote>
<p>mitiao wrote:</p>
<blockquote>
<p><a class="user active user-mention" href="https://progress.opensuse.org/users/22250">@xlai</a>, have you set this var:<br>
_CHKSEL_RATE_WAIT_TIME=120<br>
It may not solve the issue completely, but append it to your test to see if it will reduce the frequency of isse</p>
</blockquote>
<p>Will try. Thanks for the advice.</p>
</blockquote>
<p>With the parameter, the issue happen again on latest build 550.2, </p>
<p>Job list(total 3, happen ratio 3/34, nearly10%):<br>
<a href="https://openqa.suse.de/tests/1599762#" class="external">https://openqa.suse.de/tests/1599762#</a><br>
<a href="https://openqa.suse.de/tests/1599951/file/autoinst-log.txt" class="external">https://openqa.suse.de/tests/1599951/file/autoinst-log.txt</a><br>
<a href="https://openqa.suse.de/tests/1599952/file/autoinst-log.txt" class="external">https://openqa.suse.de/tests/1599952/file/autoinst-log.txt</a></p>
</blockquote>
<p>On build 555.1, the failure ratio doubled: 6 jobs were incomplete due to this issue, so the ratio is now roughly 20%.</p>
<p>I really appreciate your efforts on this and look forward to the fix. Thanks!</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1117122018-04-11T04:44:58Zokurzokurz@suse.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/111712/diff?detail_id=111574">diff</a>)</li></ul><p><a class="user active user-mention" href="https://progress.opensuse.org/users/21934">@mitiao</a> do you have an idea what to do next?</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1118892018-04-11T10:17:09Zmitiao
<ul></ul><p>okurz wrote:</p>
<blockquote>
<p><a class="user active user-mention" href="https://progress.opensuse.org/users/21934">@mitiao</a> do you have an idea what to do next?</p>
</blockquote>
<p>No idea yet; I am currently working on other things.<br>
<a class="user active user-mention" href="https://progress.opensuse.org/users/24986">@dasantiago</a> may give some help.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1130352018-04-17T02:23:56Zxlaixlai@suse.com
<ul></ul><p>On SLE15 RC3 build 567.1, four tests failed due to this issue; the failure ratio is 4/34, about 12%.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1149042018-04-24T08:46:07Zmgriessmeiermgriessmeier@suse.com
<ul><li><strong>Due date</strong> changed from <i>2018-04-24</i> to <i>2018-05-08</i></li><li><strong>Target version</strong> changed from <i>Milestone 15</i> to <i>Milestone 16</i></li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1149642018-04-24T08:53:52ZSLindoMansillaslindomansilla@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-6 priority-4 priority-default closed" href="/issues/31375">action #31375</a>: [sle][functional][ipmi][u][hard] test fails in first_boot - VNC installation on SLE 15 failed because of various issues (ipmi worker, first_boot, boot_from_pxe, await_install)</i> added</li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1149702018-04-24T08:54:20ZSLindoMansillaslindomansilla@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed child" href="/issues/32089">action #32089</a>: [sle][functional][u][ipmi][easy] test fails in first_boot - abort the test early so that we at least test the installation</i> added</li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1182932018-05-08T08:53:59Zmgriessmeiermgriessmeier@suse.com
<ul><li><strong>Due date</strong> changed from <i>2018-05-08</i> to <i>2018-05-22</i></li></ul><p><a class="user active user-mention" href="https://progress.opensuse.org/users/21934">@mitiao</a>: could you please give us an update on the state here?<br>
Are you actively working on this? Do you need any help from the QSF team?</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1183512018-05-08T09:48:39Zmitiao
<ul></ul><p>mgriessmeier wrote:</p>
<blockquote>
<p><a class="user active user-mention" href="https://progress.opensuse.org/users/21934">@mitiao</a>: could you please give us an update on the state here?<br>
Are you working on this actively? Do you need any help by the QSF team?</p>
</blockquote>
<p>No update yet; I will put this on my schedule for later.<br>
Any help is welcome, and if anyone has an idea or is able to fix it, please take it :)</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1209432018-05-22T09:02:13Zmgriessmeiermgriessmeier@suse.com
<ul><li><strong>Subject</strong> changed from <i>[sle][functional][u][tools][ipmi][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMI</i> to <i>[sle][tools][ipmi][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMI</i></li><li><strong>Due date</strong> deleted (<del><i>2018-05-22</i></del>)</li></ul><p>We haven't seen this for quite some time now on jobs which are covered by the QSF team,<br>
so we are unassigning it from our backlog for now.<br>
If you need any further assistance here, feel free to ask.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1209792018-05-22T10:11:03Zxlaixlai@suse.com
<ul></ul><p>mgriessmeier wrote:</p>
<blockquote>
<p>We didn't see this for a longer time now on jobs which are covered by the QSF team.<br>
so unassigning from our backlog for now.<br>
If you need any further assistance here, feel free to ask</p>
</blockquote>
<p>The virtualization job group still keeps hitting the issue, e.g. on build 635.1; see <a href="https://openqa.suse.de/tests/1711157/file/autoinst-log.txt" class="external">https://openqa.suse.de/tests/1711157/file/autoinst-log.txt</a>.</p>
<p>This has been marked as a serious blocking openQA backend issue for the virtualization job group, since we hit it often and it blocks a lot of our testing.</p>
<p>We hope it can be fixed as soon as possible.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1263642018-06-11T12:17:56Zmgriessmeiermgriessmeier@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed" href="/issues/37087">action #37087</a>: [kernel][s390x] test incompletes in shutdown_ltp: half-open socket?</i> added</li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1264602018-06-12T06:16:36Zmgriessmeiermgriessmeier@suse.com
<ul></ul><p>I might have an idea here... let's see...</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1265832018-06-12T09:22:27Zmgriessmeiermgriessmeier@suse.com
<ul></ul><p>I'm not able to get an ipmi job running on my openQA instance...</p>
<p>If anyone wants to pick up the idea: my guess is that we are missing a <code>disable_vnc_stalls</code> call for the ipmi backend before we reboot, so I've modified <code>prepare_system_shutdown</code> in lib/utils:</p>
<pre><code>@@ -239,7 +239,7 @@ sub prepare_system_shutdown {
     # kill the ssh connection before triggering reboot
     console('root-ssh')->kill_ssh if check_var('BACKEND', 'ipmi');
-    if (check_var('ARCH', 's390x')) {
+    if (check_var('ARCH', 's390x') || check_var('BACKEND', 'ipmi')) {
         if (check_var('BACKEND', 's390x')) {
             # kill serial ssh connection (if it exists)
             eval { console('iucvconn')->kill_ssh unless get_var('BOOT_EXISTING_S390', ''); };
         }
         console('installation')->disable_vnc_stalls;
</code></pre> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1266072018-06-12T10:15:04Zcachencachen@suse.com
<ul></ul><p>mgriessmeier wrote:</p>
<blockquote>
<p>I'm not able to get an ipmi job running on my openQA instance...</p>
<p>if anyone wants to pick up the idea - My guess is that we are missing a disable_vnc_stall for the ipmi backend before we reboot, so I've modified <code>prepare_system_shutdown</code> in lib/utils:</p>
<pre><code>@@ -239,7 +239,7 @@ sub prepare_system_shutdown {
# kill the ssh connection before triggering reboot
console('root-ssh')->kill_ssh if check_var('BACKEND', 'ipmi');
- if (check_var('ARCH', 's390x')) {
+ if (check_var('ARCH', 's390x') || check_var('BACKEND', 'ipmi')) {
if (check_var('BACKEND', 's390x')) {
# kill serial ssh connection (if it exists)
eval { console('iucvconn')->kill_ssh unless get_var('BOOT_EXISTING_S390', ''); };
}
console('installation')->disable_vnc_stalls;
</code></pre></blockquote>
<p>Nice, thank you for the solution idea!</p>
<p>@Alice, <a class="user active user-mention" href="https://progress.opensuse.org/users/21934">@mitiao</a>, any ideas? Can we pick up this fix and try it on the Beijing ipmi server first? If the fix does not harm testing, then maybe we can have it merged and try it on openqa.suse.de. What do you think?</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1270272018-06-13T02:17:51Zxlaixlai@suse.com
<ul></ul><p>cachen wrote:</p>
<blockquote>
<p>mgriessmeier wrote:</p>
<blockquote>
<p>I'm not able to get an ipmi job running on my openQA instance...</p>
<p>if anyone wants to pick up the idea - My guess is that we are missing a disable_vnc_stall for the ipmi backend before we reboot, so I've modified <code>prepare_system_shutdown</code> in lib/utils:</p>
<pre><code>@@ -239,7 +239,7 @@ sub prepare_system_shutdown {
# kill the ssh connection before triggering reboot
console('root-ssh')->kill_ssh if check_var('BACKEND', 'ipmi');
- if (check_var('ARCH', 's390x')) {
+ if (check_var('ARCH', 's390x') || check_var('BACKEND', 'ipmi')) {
if (check_var('BACKEND', 's390x')) {
# kill serial ssh connection (if it exists)
eval { console('iucvconn')->kill_ssh unless get_var('BOOT_EXISTING_S390', ''); };
}
console('installation')->disable_vnc_stalls;
</code></pre></blockquote>
<p>Nice, thank you for the solution idea!</p>
<p>@Alice, <a class="user active user-mention" href="https://progress.opensuse.org/users/21934">@mitiao</a>, any idea? can we pick up this fix and try at least in Beijing ipmi server firstly? If the fix not harm testing, then maybe we can have the fix merged and try in openqa.suse.de, what do you think?</p>
</blockquote>
<p>Thanks to Mattias for the suggestion!<br>
I will try it locally. Our situation is more complex, though: besides this step, other parts of our code also handle reboots, so we may need to extend the idea. I will try it locally first to see whether it works well, and then we can push it to openQA to see whether it reduces the number of incomplete jobs.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1279512018-06-15T09:56:06Zxlaixlai@suse.com
<ul></ul><p>PR is proposed in <a href="https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5236" class="external">https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5236</a>.</p>
<p>This change will not affect the workflow. Let's see whether it eliminates the incomplete jobs caused by the half-open socket.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1306212018-06-19T07:30:47Zmgriessmeiermgriessmeier@suse.com
<ul><li><strong>Blocks</strong> <i><a class="issue tracker-4 status-1 priority-4 priority-default" href="/issues/34471">action #34471</a>: [qe-core][functional][opensuse][medium] too early matching in too generic needle text-login-20160812</i> added</li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1309452018-06-20T02:14:29Zxlaixlai@suse.com
<ul></ul><p>xlai wrote:</p>
<blockquote>
<p>PR is proposed in <a href="https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5236" class="external">https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5236</a>.</p>
<p>This change will not affect workflow. Let's see if it can kill incomplete job due to half open socket.</p>
</blockquote>
<p>According to zaoliang's comment on the PR, the issue still reproduced easily in his tests with the change applied, so the change does not work.</p>
<p>The ticket is now open for new suggestions or for a bug fix on the backend side.</p>
<p>Please comment if you disagree with this conclusion.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1309662018-06-20T06:11:35Zszarate
<ul><li><strong>Assignee</strong> changed from <i>mitiao</i> to <i>szarate</i></li></ul><p>I will pick it up, will follow up with Zaoliang to see if I can give a hand on this.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1337332018-07-02T01:55:41Zxlaixlai@suse.com
<ul></ul><p>This issue still reproduces a lot in sle12sp4 openqa testing. </p>
<p>Failed jobs examples:<br>
<a href="http://openqa.suse.de/tests/1795850" class="external">http://openqa.suse.de/tests/1795850</a><br>
<a href="http://openqa.suse.de/tests/1795947" class="external">http://openqa.suse.de/tests/1795947</a></p>
<p>Both failed at the reboot_after_installation step, which is the final common OS installation step, rather than in virtualization-specific code.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1472212018-09-05T11:06:25Zcachencachen@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-4 priority-default closed" href="/issues/40544">action #40544</a>: [OpenQA][IPMI backend] IPMI worker can not survive reboot on dell SUT</i> added</li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1477162018-09-10T07:56:46Zcachencachen@suse.com
<ul><li><strong>Assignee</strong> changed from <i>szarate</i> to <i>jerrytang</i></li></ul><p>Hello Santi,<br>
PR#1021 and PR#5722 reduce the frequency of the half-open issue but don't fix the root cause in the ipmi backend. As the issue has a large impact on the stability of ipmi-based tests, let me add Jerry to support this ticket. I hope we can get it fixed as soon as possible.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1482202018-09-11T08:47:29Zokurzokurz@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-6 priority-4 priority-default closed" href="/issues/40655">action #40655</a>: [tools][ipmi] DIE The console isn't responding correctly. Maybe half-open socket? at /usr/lib/os-autoinst/backend/baseclass.pm line 241</i> added</li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1490332018-09-13T02:39:18Zjerrytangjtang@suse.com
<ul></ul><p>Update on this issue:<br>
An ssh socket connection that is broken by a shutdown triggers the half-open check.</p>
<p>In the virtualization tests:<br>
select_console root-ssh creates the ssh socket and adds it to the monitoring code, so rebooting without disconnecting first causes the half-open issue.</p>
<p>So this can be fixed by calling prepare_system_shutdown before every reboot/shutdown step in the test case.</p>
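<p>To illustrate the mechanism described above, here is a minimal, self-contained sketch. It is not the os-autoinst implementation: <code>probe_socket</code> is a hypothetical helper, and a local socket pair stands in for the ssh connection between worker and SUT. It only shows that a monitored socket whose peer has gone away keeps being reported as readable while delivering no data, which is the kind of state a half-open check has to catch.</p>

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use IO::Select;

# Hypothetical helper (not an os-autoinst function): classify a monitored
# socket as 'idle', 'data' or 'eof'.
sub probe_socket {
    my ($fh, $timeout) = @_;
    my $select = IO::Select->new($fh);
    my @ready  = $select->can_read($timeout);
    return 'idle' unless @ready;    # nothing pending, the socket is just quiet
    my $bytes = sysread($fh, my $buffer, 4096);
    die "sysread failed: $!" unless defined $bytes;
    # readable but zero bytes means the peer has closed its end
    return $bytes == 0 ? 'eof' : 'data';
}

# Assumption: a socket pair models the worker <-> SUT ssh connection.
socketpair(my $worker_end, my $sut_end, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
  or die "socketpair failed: $!";

my @states;
push @states, probe_socket($worker_end, 0.1);    # nothing sent yet

syswrite($sut_end, "login: ");
push @states, probe_socket($worker_end, 0.1);    # real output is pending

close($sut_end);                                 # the "reboot": the peer disappears
push @states, probe_socket($worker_end, 0.1) for 1 .. 3;

print join(',', @states), "\n";    # prints "idle,data,eof,eof,eof"
```

<p>After the peer is closed, every probe returns <code>eof</code> immediately instead of waiting for the timeout, so a monitoring loop that keeps such a socket registered would spin without making progress.</p>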
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1501672018-09-19T07:04:03Zjerrytangjtang@suse.com
<ul></ul><p>I submitted two PRs to fix this.</p>
<p>It would be great if the developers could review them and comment.</p>
<p>PR for test :<br>
<a href="https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5778" class="external">https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5778</a></p>
<p>PR for backend :<br>
<a href="https://github.com/os-autoinst/os-autoinst/pull/1026" class="external">https://github.com/os-autoinst/os-autoinst/pull/1026</a></p>
<p>Thanks </p>
<p>Jerry tang</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1513372018-09-24T12:54:52Znicksingernsinger@suse.com
<ul><li><strong>Subject</strong> changed from <i>[sle][tools][ipmi][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMI</i> to <i>[sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMI</i></li></ul><p>Changing the tag because this doesn't only affect ipmi jobs but rather <u>all</u> remote backends. </p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1513402018-09-24T12:59:00Znicksingernsinger@suse.com
<ul></ul><p>jerrytang wrote:</p>
<blockquote>
<p>update to this issue:<br>
broken ssh sock-connection by shutdown will trigger the half open check .</p>
<p>In virtualization test :<br>
select_console root-ssh will create and add ssh sock to monitor code , reboot without disconnect will cause half-open issue.</p>
<p>so this can be fix by use prepare_system_shutdown before every reboot/shutdown step in the testcase .</p>
</blockquote>
<p>Jerry, I can't reconcile your reasoning here with your code change. If I understand you correctly, calling <code>prepare_system_shutdown</code> before every reboot fixes this. However, your change doesn't call <code>prepare_system_shutdown</code>; instead it changes much more in the backend and test code. So a simple question:</p>
<ol>
<li>Does calling <code>prepare_system_shutdown</code> before reboot fix this issue? </li>
</ol>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1515052018-09-25T02:33:34Zjerrytangjtang@suse.com
<ul></ul><blockquote>
<p>Jerry, I don't understand your argumentation here while following your code-change on the same time. If I understand you correctly, calling <code>prepare_system_shutdown</code> before every reboot fixes this. However, your change doesn't call <code>prepare_system_shutdown</code> but changes much much more in the backend and test code. So simple question:</p>
<ol>
<li>Does calling <code>prepare_system_shutdown</code> before reboot fix this issue?</li>
</ol>
</blockquote>
<p>In theory, calling <code>prepare_system_shutdown</code> before reboot fixes this issue.</p>
<p>While fixing the test case side, I found that reboot is a special case here.<br>
The current scheme:</p>
<ol>
<li>Reboot requires an active ssh connection (root-ssh console: xvnc+xterm+ssh).</li>
<li>prepare_system_shutdown disconnects ssh.</li>
</ol>
<p>So you can see the problem:<br>
once prepare_system_shutdown has been called, there is no way left to trigger the reboot.</p>
<p>My PR is just one way to handle this; better approaches are welcome.</p>
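<p>The ordering conflict described above can be sketched with a toy state model. Everything here is a stand-in: <code>fake_prepare_system_shutdown</code>, <code>fake_reboot</code> and <code>run_order</code> are hypothetical names, and the model assumes that disconnecting ssh also removes the socket from monitoring. It does not propose a fix; it only shows why neither simple ordering works.</p>

```perl
use strict;
use warnings;

# Toy state; the real test API is NOT used here.
my %state;

sub fake_prepare_system_shutdown {
    # Assumed effect: the ssh connection is killed and no longer monitored.
    $state{ssh_connected}    = 0;
    $state{socket_monitored} = 0;
}

sub fake_reboot {
    # The reboot command is typed through the ssh-backed console.
    return 0 unless $state{ssh_connected};
    $state{ssh_connected} = 0;    # the SUT goes down and the peer vanishes
    return 1;
}

sub run_order {
    my (@steps) = @_;
    %state = (ssh_connected => 1, socket_monitored => 1);
    my $rebooted = 0;
    for my $step (@steps) {
        fake_prepare_system_shutdown() if $step eq 'prepare';
        $rebooted = fake_reboot()      if $step eq 'reboot';
    }
    # A monitored socket whose peer is gone is the half-open candidate.
    my $risk = ($state{socket_monitored} && !$state{ssh_connected}) ? 1 : 0;
    return "rebooted=$rebooted half_open_risk=$risk";
}

my $prepare_first = run_order('prepare', 'reboot');
my $reboot_only   = run_order('reboot');

print "$prepare_first\n";    # prints "rebooted=0 half_open_risk=0"
print "$reboot_only\n";      # prints "rebooted=1 half_open_risk=1"
```

<p>Calling the shutdown preparation first leaves no connection to send the reboot through, while rebooting directly leaves a monitored socket without a peer.</p>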
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1534672018-10-02T12:31:04Zmichalnowakmnowak@suse.com
<ul><li><strong>Related to</strong> deleted (<i><a class="issue tracker-4 status-6 priority-4 priority-default closed" href="/issues/40655">action #40655</a>: [tools][ipmi] DIE The console isn't responding correctly. Maybe half-open socket? at /usr/lib/os-autoinst/backend/baseclass.pm line 241</i>)</li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1534732018-10-02T12:31:09Zmichalnowakmnowak@suse.com
<ul><li><strong>Has duplicate</strong> <i><a class="issue tracker-4 status-6 priority-4 priority-default closed" href="/issues/40655">action #40655</a>: [tools][ipmi] DIE The console isn't responding correctly. Maybe half-open socket? at /usr/lib/os-autoinst/backend/baseclass.pm line 241</i> added</li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1534792018-10-02T12:39:06Zmichalnowakmnowak@suse.com
<ul></ul><p>Did you try <code>disable_vnc_stalls</code> on the active console before restart is triggered? I used to have problems with VNC stalls on the svirt backend, this helped. See: <a href="https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/ecf740a0999cc8cb29c93880b7d43b080933be3c/lib/power_action_utils.pm#L47" class="external">https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/ecf740a0999cc8cb29c93880b7d43b080933be3c/lib/power_action_utils.pm#L47</a>. Do you use <code>power_action()</code>? <code>disable_vnc_stalls</code> is used from there.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1544902018-10-08T11:06:28Zjerrytangjtang@suse.com
<ul></ul><p>michalnowak wrote:</p>
<blockquote>
<p>Did you try <code>disable_vnc_stalls</code> on the active console before restart is triggered? I used to have problems with VNC stalls on the svirt backend, this helped. See: <a href="https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/ecf740a0999cc8cb29c93880b7d43b080933be3c/lib/power_action_utils.pm#L47" class="external">https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/ecf740a0999cc8cb29c93880b7d43b080933be3c/lib/power_action_utils.pm#L47</a>. Do you use <code>power_action()</code>? <code>disable_vnc_stalls</code> is used from there.</p>
</blockquote>
<p>Could you please explain how this works?<br>
I am not sure the VNC socket is added to the monitored sockets (<code>$self->{select}</code>).</p>
<p>Also, this issue is not 100% reproducible.<br>
I think that is because of a race condition between:<br>
1. <code>send_key 'alt-o'</code>, which makes the host shut down and kill ssh<br>
2. <code>power_action('reboot', observe => 1, keepconsole => 1, first_reboot => 1);</code>, which removes and kills the ssh console<br>
If 2 happens before 1, everything is fine.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1550842018-10-10T07:12:27Zokurzokurz@suse.com
<ul></ul><p><a href="https://openqa.suse.de/tests/2095884/file/autoinst-log.txt" class="external">https://openqa.suse.de/tests/2095884/file/autoinst-log.txt</a> seems to be one of the last examples, just to reference a more recent job :)</p>
<p>I think you forgot to mention the important step which IMHO can destroy everything:</p>
<p><a href="https://github.com/os-autoinst/os-autoinst-distri-opensuse/commit/ae9cf2e2c51d24ae3505fa5fedcdb8b3528d1707" class="external">https://github.com/os-autoinst/os-autoinst-distri-opensuse/commit/ae9cf2e2c51d24ae3505fa5fedcdb8b3528d1707</a> introduced a <code>wait_screen_change</code> around the <code>alt-o</code> keypress. This relies on the remote VNC connection, which might already be gone at that point. I guess removing the <code>wait_screen_change</code> can actually fix the problem -> <a href="https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5915" class="external">https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5915</a></p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1551082018-10-10T07:42:20Zcachencachen@suse.com
<ul></ul><p>Thanks for all the suggestions and solutions above!</p>
<p>We tried <code>disable_vnc_stalls</code> by calling <code>prepare_system_shutdown</code> for the ipmi backend, which does not fix the issue 100%.</p>
<p>We will try the solution of removing <code>wait_screen_change</code>.</p>
<p>I feel we are very close to getting this issue fixed :)</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1552372018-10-10T09:27:11Zjerrytangjtang@suse.com
<ul></ul><a name="to-prove-the-race-condition-i-create-a-job-intend-to-get-half-open-issue"></a>
<h2 >To prove the race condition, I created a job intended to trigger the half-open issue:<a href="#to-prove-the-race-condition-i-create-a-job-intend-to-get-half-open-issue" class="wiki-anchor">¶</a></h2>
<pre><code># after the ipmi host has booted
use_ssh_serial_console;
type_string("reboot\n");
sleep 4;    # give the reboot time to kill ssh first
save_screenshot;
power_action('reboot', observe => 1, keepconsole => 1, first_reboot => 1);
save_screenshot;
</code></pre>
<hr>
<p>I ran it 4 times, and every run hit the half-open problem.<br>
<a href="http://10.67.132.86/tests/309#next_previous" class="external">http://10.67.132.86/tests/309#next_previous</a></p>
<p>I also replaced <code>type_string("reboot\n");</code> with <code>type_string("shutdown -r 1\n");</code> to avoid the half-open state,<br>
and with that change the half-open issue did not occur.</p>
<p>The problem happens after the <code>alt-o</code> key is sent, so your PR may not work as expected.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1554772018-10-10T15:51:06Zokurzokurz@suse.com
<ul></ul><p>So let me try to phrase in my own words: Pressing 'alt-o' causes the SUT to reboot which will close the socket from one side. This is not a problem if the corresponding ssh connection is killed fast enough but not otherwise, hence the race condition. However, we can not simply call <code>prepare_system_shutdown</code> which would also be triggered by <code>power_action</code> because we still need a connection to the SUT to send the 'alt-o' key. <a href="https://github.com/os-autoinst/os-autoinst/pull/902" class="external">https://github.com/os-autoinst/os-autoinst/pull/902</a> which was done to detect "half-open sockets" might have introduced exactly that problem. IMHO the backend needs to handle this gracefully, e.g. we press 'alt-o' and then just call <code>power_action</code> but I do not know in which way exactly this could work.</p>
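<p>To illustrate the socket behaviour described above, here is a minimal sketch in Python (not os-autoinst code, which is Perl). It assumes that the SUT going down can be modelled as the peer end of a local <code>socketpair</code> being closed; a real reboot may not close the connection this cleanly. The helper name <code>count_eof_wakeups</code> is hypothetical. The sketch shows that once the peer is gone, <code>select</code> keeps reporting the socket as readable while reads return no data, so a loop that monitors the socket wakes up on every poll.</p>
<pre><code class="python">import select
import socket


def count_eof_wakeups(polls=5):
    """Count polls where select() reports a socket whose peer is gone."""
    # Assumption: a local socketpair stands in for the console connection.
    local, remote = socket.socketpair()
    remote.close()  # models the SUT side of the connection going away
    hits = 0
    try:
        for _ in range(polls):
            readable, _w, _x = select.select([local], [], [], 0.1)
            # An empty read on a "readable" socket means end of stream.
            if local in readable and local.recv(1024) == b"":
                hits += 1
    finally:
        local.close()
    return hits


if __name__ == "__main__":
    print("wakeups without data:", count_eof_wakeups())
</code></pre>
<p>Every poll counts as a wakeup without data, which is the kind of pattern a backend could interpret as a console that is not responding correctly.</p>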
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1554862018-10-11T02:13:08Zjerrytangjtang@suse.com
<ul></ul><p>okurz wrote:</p>
<blockquote>
<p>So let me try to phrase in my own words: Pressing 'alt-o' causes the SUT to reboot which will close the socket from one side. This is not a problem if the corresponding ssh connection is killed fast enough but not otherwise, hence the race condition. However, we can not simply call <code>prepare_system_shutdown</code> which would also be triggered by <code>power_action</code> because we still need a connection to the SUT to send the 'alt-o' key. <a href="https://github.com/os-autoinst/os-autoinst/pull/902" class="external">https://github.com/os-autoinst/os-autoinst/pull/902</a> which was done to detect "half-open sockets" might have introduced exactly that problem. IMHO the backend needs to handle this gracefully, e.g. we press 'alt-o' and then just call <code>power_action</code> but I do not know in which way exactly this could work.</p>
</blockquote>
<p>Exactly.<br>
The installation session has:</p>
<ul>
<li>no I/O cache policy,</li>
<li>no simple way to reboot from within the ssh session.</li>
</ul>
<p>That is why my PR takes the second route and uses hard_reset.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1557322018-10-11T18:17:49Zokurzokurz@suse.com
<ul><li><strong>Target version</strong> deleted (<del><i>Milestone 16</i></del>)</li></ul><p>M16 has been closed for a long time.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1559332018-10-12T07:06:20Zcachencachen@suse.com
<ul></ul><p>The issue still happened from time to time during RC2 build 0421 acceptance testing; we have to keep watching the tests and be ready to retrigger those that fail because of it. Personally I don't like to push too much, but it affects the delivery of acceptance results a lot, and more ipmi-related tests will be added :(</p>
<p>Let me try to summarize the current situation from the comments and PRs:<br>
First of all, I think we are clear on why and when the half-open socket happens in the installation -> reboot_after_installation step; we are finally on the same page :)<br>
So far there are 2 options:<br>
1) Just reduce how often the error is hit, with Oliver's PR#5915 or with prepare_system_shutdown<br>
2) Fix it as in Jerry's PRs by forcibly rebooting the ipmi server via hard_reset (perhaps rebooting the server directly in power_action specifically for ipmi is preferable?)</p>
<p><a class="user active user-mention" href="https://progress.opensuse.org/users/15">@coolo</a>, you are the PO of openQA and the expert/author of the ipmi backend; we need your suggestion and decision, or maybe you have a better solution :)</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1559422018-10-12T07:51:12Zokurzokurz@suse.com
<ul></ul><p>cachen wrote:</p>
<blockquote>
<p>So far there are 2 options:<br>
1) Just reduce how often the error is hit, with Oliver's PR#5915 or with prepare_system_shutdown</p>
</blockquote>
<p>I think this option can be applied regardless. It might only help with the symptoms, but it still helps :)</p>
<blockquote>
<p>2) Fix it as in Jerry's PRs by forcibly rebooting the ipmi server via hard_reset (perhaps rebooting the server directly in power_action specifically for ipmi is preferable?)</p>
<p><a class="user active user-mention" href="https://progress.opensuse.org/users/15">@coolo</a>, you are the PO of openQA and the expert/author of the ipmi backend; we need your suggestion and decision, or maybe you have a better solution :)</p>
</blockquote>
<p>I would still go with <a class="issue tracker-4 status-3 priority-4 priority-default closed" title="action: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Ha... (Resolved)" href="https://progress.opensuse.org/issues/32746#note-53">#32746#note-53</a> which is a different approach: Handle the uni-directional socket termination gracefully in the backend.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1562062018-10-12T10:22:25Zjerrytangjtang@suse.com
<ul></ul><p>Anyway, I updated my PR following some of the points coolo mentioned and moved all actions into the prepare_system_shutdown function:<br>
<a href="https://github.com/os-autoinst/os-autoinst/pull/1026#issuecomment-429277855" class="external">https://github.com/os-autoinst/os-autoinst/pull/1026#issuecomment-429277855</a></p>
<p>However, it still needs backend API support for my "NOT" graceful way.</p>
<p>I hope this issue can be fixed soon; I am waiting for a graceful way.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1562182018-10-12T11:10:22Zokurzokurz@suse.com
<ul></ul><p>jerrytang wrote:</p>
<blockquote>
<p>I hope this issue can be fixed soon; I am waiting for a graceful way.</p>
</blockquote>
<p>Don't wait, better try to fix it yourself. I made a suggestion in<br>
<a href="https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5778#pullrequestreview-164196772" class="external">https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5778#pullrequestreview-164196772</a></p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1563682018-10-12T13:39:22Zokurzokurz@suse.com
<ul></ul><p><a href="https://github.com/os-autoinst/os-autoinst/pull/902" class="external">https://github.com/os-autoinst/os-autoinst/pull/902</a> merged, <a href="https://openqa.opensuse.org/tests/772175" class="external">https://openqa.opensuse.org/tests/772175</a> is VR on normal qemu-x86_64, <a href="https://openqa.suse.de/tests/2170109#" class="external">sle-15-SP1-Installer-DVD-x86_64-Build66.2-gi-guest_sles12sp2-on-host-developing-kvm@64bit-ipmi</a> shows that at least the IPMI virtualization tests are also not broken now :)</p>
<p>Please crosscheck if this helps to mitigate the problem, e.g. with a good statistical analysis, monitor jobs triggered after that, etc.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1568212018-10-15T06:04:07Zcachencachen@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-4 priority-default closed behind-schedule" href="/issues/41330">action #41330</a>: [functional][y][s390x][investigation][timebox:4h] test fails in welcome - half-open socket in post_fail_hook causing incomplete job</i> added</li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1591522018-10-22T07:13:16Zcachencachen@suse.com
<ul><li><strong>Assignee</strong> deleted (<del><i>jerrytang</i></del>)</li></ul><p>So far the merged PR#5915 has reduced how often the error is hit in virtualization tests. Since the remaining fix has to go deep into the openQA backend, I agree with Jerry to hand it back to the openQA Tools group. The discussions in the PRs below can be followed.</p>
<p><a href="https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5953" class="external">https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5953</a><br>
<a href="https://github.com/os-autoinst/os-autoinst/pull/1041" class="external">https://github.com/os-autoinst/os-autoinst/pull/1041</a></p>
<p>Thanks to everyone involved in this ticket for your discussions and solutions.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1593292018-10-22T18:56:23Zokurzokurz@suse.com
<ul></ul><p>Hi cachen,</p>
<p>cachen wrote:</p>
<blockquote>
<p>So far the merged PR#5915 has reduced how often the error is hit in virtualization tests</p>
</blockquote>
<p>I am happy to hear that my PR helped a bit :)</p>
<blockquote>
<p>Since the remaining fix has to go deep into the openQA backend, I agree with Jerry to hand it back to the openQA Tools group […]</p>
</blockquote>
<p>Hm, I am not sure this will work. The current <a href="https://wiki.microfocus.net/index.php?title=OpenQA#Team" class="external">tools team</a> does not really have experts in the aforementioned domain and I think Jerry was already on a good track to fix it for good.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1593502018-10-23T02:48:39Zcachencachen@suse.com
<ul></ul><p>okurz wrote:</p>
<blockquote>
<p>Hi cachen,</p>
<p>cachen wrote:</p>
<blockquote>
<p>So far the merged PR#5915 has reduced how often the error is hit in virtualization tests</p>
</blockquote>
<p>I am happy to hear that my PR helped a bit :)</p>
</blockquote>
<p>I appreciate it very much ;)</p>
<blockquote>
<blockquote>
<p>Since the remaining fix has to go deep into the openQA backend, I agree with Jerry to hand it back to the openQA Tools group […]</p>
</blockquote>
<p>Hm, I am not sure this will work. The current <a href="https://wiki.microfocus.net/index.php?title=OpenQA#Team" class="external">tools team</a> does not really have experts in the aforementioned domain and I think Jerry was already on a good track to fix it for good.</p>
</blockquote>
<p>Unfortunately, I have to move Jerry back to performance testing.</p>
<p>mitiao (Wei Jiang) knows the whole background; he has been heavily involved in the discussion and testing in the Beijing office :)<br>
I think he can take over the remaining work to optimize the code. Of course, Jerry and the Virtualization group will continue to support him and verify his code.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1647652018-11-14T07:45:10Zokurzokurz@suse.com
<ul></ul><p>I understood from xlai that the issue is still present, even though it is less likely. According to xlai, mitiao is on it.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1749352018-12-19T09:05:11Zcachencachen@suse.com
<ul><li><strong>Assignee</strong> set to <i>mitiao</i></li><li><strong>Priority</strong> changed from <i>High</i> to <i>Normal</i></li></ul><p>Let me assign the ticket to mitiao, since there has been no response for more than a month.<br>
From my understanding, QA-VT still expects the ipmi issue to be 100% fixed in the tool backend, but the priority can be lowered (changed to 'normal') since Oli's workaround made the issue happen less often.<br>
@Alice, correct me if I am wrong.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1761052019-01-03T06:56:37Zmitiao
<ul><li><strong>Assignee</strong> changed from <i>mitiao</i> to <i>xlai</i></li></ul><p>Re-assigning to Alice since I am leaving...</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1761412019-01-03T07:05:53Zcachencachen@suse.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Blocked</i></li><li><strong>Assignee</strong> deleted (<del><i>xlai</i></del>)</li></ul><p>Sorry, Alice isn't a member of the Tools group and she isn't responsible for the backend. Let me remove the assignment and mark the status as 'Blocked' for lack of human resources, since it seems that currently nobody from the Tools group would like to take over.</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1761922019-01-03T08:44:41Zokurzokurz@suse.com
<ul><li><strong>Status</strong> changed from <i>Blocked</i> to <i>Workable</i></li></ul><p>Commonly we use the status "Blocked" only in relation with a blocking ticket and a person tracking the blocked status so that this person is also automatically informed about ticket updates and can then update the ticket. Setting back to "Workable" which IMHO is the most suitable for a task that is in principle "workable" but no one picked it up, ok with that?</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1762072019-01-03T09:23:05Zcachencachen@suse.com
<ul><li><strong>Status</strong> changed from <i>Workable</i> to <i>New</i></li></ul><p>okurz wrote:</p>
<blockquote>
<p>Commonly we use the status "Blocked" only in relation with a blocking ticket and a person tracking the blocked status so that this person is also automatically informed about ticket updates and can then update the ticket. Setting back to "Workable" which IMHO is the most suitable for a task that is in principle "workable" but no one picked it up, ok with that?</p>
</blockquote>
<p>I don't want to judge whether this is 'workable' or not, since this seems to be a 'hard' task. Let's leave the status at 'New' until someone can take it; they can then mark it as 'workable', 'in progress' or something else :)</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1914322019-02-21T08:42:07Zokurzokurz@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed" href="/issues/46964">action #46964</a>: [functional][u][s390x] test fails in the middle of execution (not installation) as incomplete with "half-open socket?" – connection to machine vanished?</i> added</li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1938022019-02-28T02:58:22Zcachencachen@suse.com
<ul></ul><p>New occurrences in the VT reboot_after_installation step:</p>
<p><a href="https://openqa.nue.suse.com/tests/2503084" class="external">https://openqa.nue.suse.com/tests/2503084</a><br>
<a href="https://openqa.nue.suse.com/tests/2504185" class="external">https://openqa.nue.suse.com/tests/2504185</a></p>
<p>Is it caused by the rewrite in commit 023c4c09dca87d17b3cec325f3adb5288525a211?</p>
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=1938172019-02-28T05:08:08Zcachencachen@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed" href="/issues/48482">action #48482</a>: [ipmi][functional][u] test fails in reboot_after_installation; The console isn't responding correctly. Maybe half-open socket</i> added</li></ul> openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=2001442019-03-18T09:30:08Zokurzokurz@suse.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Resolved</i></li><li><strong>Assignee</strong> set to <i>okurz</i></li></ul><p><a href="https://openqa.nue.suse.com/tests/2503084#next_previous" class="external">https://openqa.nue.suse.com/tests/2503084#next_previous</a> and the last ten jobs in this scenario are all green. <a class="issue tracker-4 status-3 priority-5 priority-high3 closed" title="action: [ipmi][functional][u] test fails in reboot_after_installation; The console isn't responding corre... (Resolved)" href="https://progress.opensuse.org/issues/48482">#48482</a> as well as <a class="issue tracker-4 status-3 priority-6 priority-high2 closed" title="action: [sle][functional][u][s390x][kvm] test fails in reboot_after_installation - "The console isn't res... (Resolved)" href="https://progress.opensuse.org/issues/48260">#48260</a> should have solved this. <a href="https://github.com/os-autoinst/os-autoinst/pull/1120" class="external">https://github.com/os-autoinst/os-autoinst/pull/1120</a> in particular should help by giving better feedback about what went wrong in case of errors. In most cases the "half-open sockets" were caused by problems introduced by the tests, but it was not obvious what caused them. We can not easily prevent future test code changes from introducing the same symptom again; however, the hint in the error message should make it more obvious what the test writer is missing or has done wrong. In short: The most likely problem is that the tests still try to access a console while the SUT is in reboot or shutdown. This can be prevented by explicitly disabling stall detection on these consoles or terminating ssh-based console connections before triggering a reboot or shutdown.</p>
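<p>As a rough illustration of what "disabling stall detection" means, the following Python sketch models a detector that flags a console when its socket wakes the monitoring loop too often within a short time window. This is an assumption about the general technique, not the actual os-autoinst implementation; the class <code>StallDetector</code>, its parameters and its default values are hypothetical.</p>
<pre><code class="python">from collections import deque


class StallDetector:
    """Flag a console whose socket wakes the monitoring loop too often."""

    def __init__(self, window=1.0, limit=3, enabled=True):
        self.window = window    # seconds of history to keep (assumed value)
        self.limit = limit      # wakeups tolerated inside the window
        self.enabled = enabled  # False models disabled stall detection
        self.hits = deque()

    def record_wakeup(self, now):
        """Record one wakeup at time `now`; return True if it looks stalled."""
        if not self.enabled:
            return False
        self.hits.append(now)
        # Drop wakeups that have fallen out of the time window.
        while now - self.hits[0] > self.window:
            self.hits.popleft()
        return len(self.hits) > self.limit


if __name__ == "__main__":
    detector = StallDetector()
    for t in (0.0, 0.1, 0.2, 0.3):
        print(t, detector.record_wakeup(t))
</code></pre>
<p>With detection enabled, a burst of wakeups like the one a dead connection produces trips the detector; with <code>enabled=False</code> the same burst is ignored, which is why a test has to switch detection off (or close the ssh-based console) before it triggers the reboot.</p>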
openQA Tests - action #32746: [sle][tools][remote-backends][hard] Incomplete job because console isn't responding correctly. Half-open socket on IPMIhttps://progress.opensuse.org/issues/32746?journal_id=2600992019-11-27T08:59:37Zszarate
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-12 priority-4 priority-default" href="/issues/60161">action #60161</a>: [network][qem] auto_review:"The console.*(root-virtio-terminal1|sut).*is not responding.*half-open socket" test incompletes in t20_teaming_ab_all_link</i> added</li></ul>