https://progress.opensuse.org/https://progress.opensuse.org/themes/openSUSE/favicon/favicon.ico?15829177842022-03-24T09:29:13ZopenSUSE Project Management ToolopenQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5039182022-03-24T09:29:13Zokurzokurz@suse.com
<ul><li><strong>Project</strong> changed from <i>openQA Tests</i> to <i>openQA Infrastructure</i></li><li><strong>Category</strong> deleted (<del><i>Infrastructure</i></del>)</li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5039242022-03-24T09:37:13Zokurzokurz@suse.com
<ul></ul><p>On openqaw5-xen.qa.suse.de I installed tmux and within a tmux session I started in splits <code>mtr scc.suse.com</code>, <code>mtr openqaw5-xen</code>, <code>mtr 2620:113:80c0:80a0:10:162:0:1</code>, <code>mtr 10.162.0.19</code> that is qanetnue.qa.suse.de</p>
<p>We found no loss to scc.suse.com and no loss to download.opensuse.org, but a 30% loss to both the IPv6 and IPv4 addresses of qanet from openqaw5-xen.qa.suse.de. We started <code>mtr qanet14nue.qa.suse.de</code>, which looks ok. An ssh connection from openqaw5-xen.qa.suse.de to qanet14nue.qa.suse.de, opened with <code>ssh -oKexAlgorithms=+diffie-hellman-group1-sha1 -oHostKeyAlgorithms=+ssh-rsa admin@10.162.0.74</code>, stopped a few seconds after connecting.</p>
<p>I started a screen session on osd running an ssh connection to qanet14nue, the rack switch for openqaw5-xen.qa.suse.de (see <a href="https://racktables.nue.suse.com/index.php?page=rack&rack_id=928" class="external">https://racktables.nue.suse.com/index.php?page=rack&rack_id=928</a>), and in there ran <code>ping qanet count 0</code>, which resolves to IPv4 and runs a continuous ping. I see gaps in the connection:</p>
<pre><code>18 bytes from 10.162.0.1: icmp_seq=1. time=0 ms
… (no outage)
18 bytes from 10.162.0.1: icmp_seq=975. time=0 ms
18 bytes from 10.162.0.1: icmp_seq=976. time=0 ms
PING: no reply from 10.162.0.1
… (repeated)
PING: no reply from 10.162.0.1
PING: timeout
18 bytes from 10.162.0.1: icmp_seq=990. time=0 ms
… (no outage)
18 bytes from 10.162.0.1: icmp_seq=1016. time=0 ms
PING: no reply from 10.162.0.1
PING: timeout
… (repeated)
PING: timeout
18 bytes from 10.162.0.1: icmp_seq=1028. time=0 ms
</code></pre>
<p>and the cycle repeats, so there are intermittent outages on the connection. Next step: check the connection between qanet14 and the switch in the rack of qanet, that is qanet15. I see the same outages there, so there is a problem in the connection between qanet14 and qanet15. I suggest crosschecking between two other switches and then bisecting.</p>
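<p>To quantify gaps like the ones in the log above, a small parser can compare consecutive <code>icmp_seq</code> values. This is only a sketch: it assumes the complete, unabbreviated ping log is available and that the switch keeps incrementing the sequence number for lost packets (as the jump from 976 to 990 suggests). The function name <code>find_outages</code> and the condensed sample are illustrative, not part of any existing tooling.</p>

```python
import re

# Matches the sequence number in lines such as
# "18 bytes from 10.162.0.1: icmp_seq=976. time=0 ms"
SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def find_outages(lines):
    """Return (last_seq_before_gap, first_seq_after_gap, missing_count) tuples.

    Assumption: the log is complete, so any jump in icmp_seq is real loss.
    """
    outages = []
    previous = None
    for line in lines:
        match = SEQ_RE.search(line)
        if match is None:
            # "PING: no reply" and "PING: timeout" lines carry no sequence number
            continue
        seq = int(match.group(1))
        if previous is not None and seq > previous + 1:
            outages.append((previous, seq, seq - previous - 1))
        previous = seq
    return outages


# Condensed, illustrative sample modeled on the switch output
sample = """\
18 bytes from 10.162.0.1: icmp_seq=975. time=0 ms
18 bytes from 10.162.0.1: icmp_seq=976. time=0 ms
PING: no reply from 10.162.0.1
PING: timeout
18 bytes from 10.162.0.1: icmp_seq=990. time=0 ms
18 bytes from 10.162.0.1: icmp_seq=991. time=0 ms
""".splitlines()

for before, after, missing in find_outages(sample):
    print(f"gap after seq {before}: {missing} replies missing, resumed at seq {after}")
```

<p>Running the same function on logs captured from different switch pairs would give comparable numbers for the bisection.</p>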
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5039722022-03-24T10:18:37Zokurzokurz@suse.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li><li><strong>Priority</strong> changed from <i>High</i> to <i>Urgent</i></li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5039992022-03-24T10:41:57Zokurzokurz@suse.com
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-4 status-1 priority-4 priority-default child" href="/issues/108872">action #108872</a>: Outdated information on openqaw5-xen https://racktables.suse.de/index.php?page=object&tab=default&object_id=3468</i> added</li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5041072022-03-24T13:41:43Zokurzokurz@suse.com
<ul></ul><p>The information in racktables was not up to date. The current position for openqaw5-xen.qa.suse.de is (obviously) not in the QA labs anymore. It's in <a href="https://racktables.nue.suse.com/index.php?page=rack&rack_id=520" class="external">NUE-SRV2-B-3</a>. I moved the machine in racktables. So the switch that openqaw5-xen.qa.suse.de is connected to is actually qanet20nue.qa.suse.de. Compared to other switches, that switch feels partially unresponsive when we execute any command there, e.g. <code>show system</code> to show general system parameters. We found that the switch has likely been running for 490 days already, so we tried to trigger a reboot with the <code>reload</code> command. That got stuck, so we decided that the switch should be manually power cycled with physical access. After that we can check whether the responsiveness changes and whether there is still packet loss from openqaw5-xen.qa.suse.de to qanet.qa.suse.de. If there is, the next steps are to clean up the switch configuration and crosscheck with other switches, other machines in the same rack connected to the same switch, etc.</p>
<p>According to<br>
<a href="https://www.cisco.com/c/de_de/support/switches/sg300-28-28-port-gigabit-managed-switch/model.html#~tab-downloads" class="external">https://www.cisco.com/c/de_de/support/switches/sg300-28-28-port-gigabit-managed-switch/model.html#~tab-downloads</a><br>
there is still a year of product support for the switches we use.<br>
So, in addition to the above, we should document the important configuration of all switches that we maintain, e.g. crosscheck the information in racktables about ports, VLAN entries, uplink port aggregation, etc. Then we should ensure that all our switches run up-to-date firmware and clean out old entries, e.g. for machines that are no longer there, and maybe do some factory resets and configure the switches cleanly from scratch.</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5041522022-03-24T14:38:02Zokurzokurz@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-1 priority-4 priority-default child" href="/issues/108266">action #108266</a>: grenache: script_run() commands randomly time out since server room move</i> added</li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5042362022-03-24T21:13:04Zokurzokurz@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed child" href="/issues/108896">action #108896</a>: [ppc64le] auto_review:"(?s)Size of.*differs, expected.*but downloaded.*Download.*failed: 521 Connect timeout":retry</i> added</li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5042752022-03-24T21:44:49Zokurzokurz@suse.com
<ul><li><strong>Subject</strong> changed from <i>Network performance problems, DNS, DHCP, within SUSE QA network</i> to <i>Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"Error connecting to VNC server.*openqaw5-xen.*Connection timed out":retry but also other symptoms</i></li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5042932022-03-24T21:50:01Zokurzokurz@suse.com
<ul><li><strong>Subject</strong> changed from <i>Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"Error connecting to VNC server.*openqaw5-xen.*Connection timed out":retry but also other symptoms</i> to <i>Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*openqaw5-xen.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms</i></li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5042992022-03-24T21:55:27Zokurzokurz@suse.com
<ul><li><strong>Subject</strong> changed from <i>Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*openqaw5-xen.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms</i> to <i>Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms</i></li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5044852022-03-25T04:12:04Zopenqa_reviewopenqa-review@suse.de
<ul><li><strong>Due date</strong> set to <i>2022-04-08</i></li></ul><p>Setting due date based on mean cycle time of SUSE QE Tools</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5046502022-03-25T10:00:58Znicksingernsinger@suse.com
<ul></ul><p>okurz wrote:</p>
<blockquote>
<p>So the switch that openqaw5-xen.qa.suse.de is connected to is actually qanet20nue.qa.suse.de. That switch feels partially unresponsive when we execute any command there, e.g. <code>show system</code> to show general system parameters, compared to other switches. We found that the switch likely runs for already 490 days so we tried to trigger a reboot with the <code>reload</code> command. That got stuck so we decided that the switch should be manually power cycled with physical access.</p>
</blockquote>
<p>I did so yesterday evening. The switch came up again and our address-table is now more populated than before. I could confirm that openqaw5-xen is connected to port 47 on that switch (physically as well as with <code>show mac address-table address 0c:c4:7a:*:c2</code>). It doesn't crash any more if you access the address-table but it isn't really that much more responsive. This could be because qanet20 has way more ports than our reference switches have.</p>
<p>When I run a ping from that switch to qanet I still see a loss of data every now and then for some seconds. It is directly connected to the core switch:</p>
<pre><code>qanet20nue#show cdp neighbors
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
S - Switch, H - Host, I - IGMP, r - Repeater, P - VoIP Phone
M - Remotely-Managed Device, C - CAST Phone Port,
W - Two-Port MAC Relay
Device ID Local Adv Time To Capability Platform Port ID
Interface Ver. Live
------------------ ----------- ---- ------- ---------- ------------ -----------
Nx5696Q-Core1.mgmt gi51 2 146 R S C N5K-C5696Q Ethernet101
.suse.de.mgmt.suse /1/39
.de.mgmt.suse.de.m
gmt.suse.de(FOC222
0R2B1)
Nx5696Q-Core2.mgmt gi52 2 147 R S C N5K-C5696Q Ethernet102
.suse.de.mgmt.suse /1/39
.de.mgmt.suse.de.m
gmt.suse.de(FOC222
2R07L)
</code></pre> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5047432022-03-25T13:13:55Znicksingernsinger@suse.com
<ul></ul><p>I ran a few more pings from and between qanet20 and qanet15. w5-xen connects to qanet20, qanet connects to qanet15.<br>
Both switches get their upstream from Nx5696Q-Core1.mgmt.suse.de and Nx5696Q-Core2.mgmt.suse.de.</p>
<p>Ping from qanet15 -> qanet: everything fine, no timeouts visible after over 7k pings<br>
Ping from qanet20 -> qanet15: everything fine, no timeouts visible after over 6.5k pings<br>
Ping from qanet20 -> qanet: timeouts every couple of seconds for 1-2 seconds. Several timeouts observed in ~2k pings</p>
<p>For me this indicates somehow that qanet15 struggles to switch requests to qanet.</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5049172022-03-26T14:20:12Zokurzokurz@suse.com
<ul></ul><p>Given that we "discovered" the HTML configuration pages for <a href="http://qanet15nue.qa.suse.de/" class="external">http://qanet15nue.qa.suse.de/</a>, and I assume that most if not all other switches have them as well, I recommend we</p>
<ul>
<li>Review the configuration and settings on all switches</li>
<li>Ensure time-synced clocks are enabled</li>
<li>Backup current configuration</li>
<li>Reboot all switches to give them a fresh start</li>
<li>Conduct ping tests between all switches</li>
<li>Update configuration, e.g. mailing list addresses for current contact persons</li>
</ul>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5049502022-03-26T15:23:22Zokurzokurz@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-4 priority-default closed child" href="/issues/108953">action #108953</a>: [tools] Performance issues in some s390 workers</i> added</li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5051242022-03-28T06:21:45Zokurzokurz@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-6 priority-high2 closed" href="/issues/109028">action #109028</a>: [openqa][worker][sut] Very severe stability and connectivity issues of openqa workers and suts</i> added</li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5051482022-03-28T07:00:21Znicksingernsinger@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-4 priority-default closed" href="/issues/108737">action #108737</a>: [sle][security][backlog][powerVM]test fails in bootloader, any issue with install server or network performance issue?</i> added</li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5051842022-03-28T07:57:30Zokurzokurz@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-6 priority-high2 closed" href="/issues/109055">action #109055</a>: Broken workers alert</i> added</li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5052292022-03-28T08:34:14Znicksingernsinger@suse.com
<ul></ul><p>I asked Gerhard in a private slack message if he can check the core switch for qanet15 (the one qanet is attached to). What we saw:</p>
<p>core1: no errors<br>
core2: 16359 output errors<br>
both show ~15k "output discards"</p>
<p>Neither of us knows what these error counters indicate. But he was kind enough to remove the second link of qanet15 (the one to core2) for the time being.</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5052502022-03-28T08:38:07Znicksingernsinger@suse.com
<ul></ul><p>nicksinger wrote:</p>
<blockquote>
<p>Ping from qanet20 -> qanet: timeouts every couple of seconds for 1-2 seconds. Several timeouts observed in ~2k pings</p>
</blockquote>
<p>After the change I see a 100% success rate pinging from qanet20 -> qanet:</p>
<pre><code>1000 packets transmitted, 1000 packets received, 0% packet loss
</code></pre> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5052712022-03-28T08:57:36Znicksingernsinger@suse.com
<ul></ul><p>We also saw:</p>
<p>Core1: 0 errors, 470k discards<br>
Core2: 6150 errors, 76k discards</p>
<p>for qanet13 (which powerqaworker-qam-1.qa.suse.de is connected to). I previously saw a packet loss of ~50%. Now, after removing the connection to core2, I get a fairly constant ping with only ~0.1% loss.</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5052922022-03-28T09:17:38Znicksingernsinger@suse.com
<ul></ul><p>I created <a href="https://sd.suse.com/servicedesk/customer/portal/1/SD-81499" class="external">https://sd.suse.com/servicedesk/customer/portal/1/SD-81499</a> for a proper fix.</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5053522022-03-28T09:45:31Zokurzokurz@suse.com
<ul></ul><p>Thank you.<br>
I sent an announcement to <a href="https://suse.slack.com/archives/C02CANHLANP/p1648459597479859" class="external">https://suse.slack.com/archives/C02CANHLANP/p1648459597479859</a> as well:</p>
<blockquote>
<p>@here update about the QA lab related openQA network problems that seem to have appeared or significantly increased since last week. We could pinpoint the problem to network switches, in particular the core switches maintained by SUSE IT EngInfra providing connection for the QA switches. gschlotter (EngInfra) and nsinger (QE Tools) have decided to remove the second link between QA switch qanet15 and core2. Now we have a 100% success rate pinging from the switches to the machine qanet. We will monitor the situation and coordinate with EngInfra to also look into the other switch combinations as well as further followup for the future. See <a href="https://progress.opensuse.org/issues/108845" class="external">https://progress.opensuse.org/issues/108845</a> for details.</p>
</blockquote>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5054212022-03-28T11:09:43Znicksingernsinger@suse.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/505421/diff?detail_id=477761">diff</a>)</li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5054362022-03-28T11:31:06Znicksingernsinger@suse.com
<ul></ul><p>okurz wrote:</p>
<blockquote>
<p>Given that we "discovered" the HTML configuration pages for <a href="http://qanet15nue.qa.suse.de/" class="external">http://qanet15nue.qa.suse.de/</a> and I assume then that all other if not all switches have that as well I recommend we</p>
<ul>
<li>Review the configuration and settings on all switches</li>
<li>Ensure to have enabled time synced clocks</li>
<li>Backup current configuration</li>
<li>Reboot all switches to give them a fresh start</li>
<li>Conduct ping tests between all switches</li>
<li>Update configuration, e.g. mailing list addresses for current contact persons</li>
</ul>
</blockquote>
<p>Please be aware that a complete reset also kills our web and ssh access to the switches, so we would need serial access to configure them initially again. What I did for now is the following:</p>
<blockquote>
<ul>
<li>Review the configuration and settings on all switches</li>
<li>Ensure to have enabled time synced clocks</li>
<li>Update configuration, e.g. mailing list addresses for current contact persons (+ Location field update)</li>
</ul>
</blockquote>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5054422022-03-28T11:36:26Znicksingernsinger@suse.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/505442/diff?detail_id=477773">diff</a>)</li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5057932022-03-29T09:02:27Znicksingernsinger@suse.com
<ul><li><strong>Assignee</strong> changed from <i>okurz</i> to <i>nicksinger</i></li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5059222022-03-29T12:21:05Znicksingernsinger@suse.com
<ul></ul><p>nicksinger wrote:</p>
<blockquote>
<p>I created <a href="https://sd.suse.com/servicedesk/customer/portal/1/SD-81499" class="external">https://sd.suse.com/servicedesk/customer/portal/1/SD-81499</a> for a proper fix.</p>
</blockquote>
<p>One fiber was broken. It got replaced today and we enabled the 2nd link for qanet15 again. Pings to grenache (which is connected to qanet15) and qanet15 itself work flawlessly now.</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5060392022-03-29T17:14:00Zokurzokurz@suse.com
<ul></ul><p>Ok, given that, I think a good metric could be to run ping from one switch to another switch and also to central components, so e.g.</p>
<pre><code>ssh -oKexAlgorithms=+diffie-hellman-group1-sha1 -oHostKeyAlgorithms=+ssh-rsa admin@10.162.0.74 "ping qanet count 600"
</code></pre>
<p>and check for any gaps. WDYT?</p>
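<p>As a rough sketch of how such a check could be scripted, the following builds the ssh command shown above and computes the loss from the summary line the switch prints (the format is taken from the qanet20 output earlier in this ticket). The function names are hypothetical, the command is only constructed here and not executed, and the summary alone does not show when gaps occurred.</p>

```python
import re

# Summary format as printed by the switch, e.g.
# "1000 packets transmitted, 1000 packets received, 0% packet loss"
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) packets received")


def switch_ping_command(switch, target, count):
    """Build the ssh command that runs a ping on a switch (hypothetical helper)."""
    return [
        "ssh",
        "-oKexAlgorithms=+diffie-hellman-group1-sha1",
        "-oHostKeyAlgorithms=+ssh-rsa",
        f"admin@{switch}",
        f"ping {target} count {count}",
    ]


def packet_loss_percent(output):
    """Compute the loss percentage from the ping summary found in `output`."""
    match = SUMMARY_RE.search(output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets were transmitted")
    return 100.0 * (sent - received) / sent


command = switch_ping_command("10.162.0.74", "qanet", 600)
print(" ".join(command))
print(packet_loss_percent("1000 packets transmitted, 1000 packets received, 0% packet loss"))
```

<p>The command list could be passed to <code>subprocess.run(command, capture_output=True, text=True)</code> and the captured output fed into <code>packet_loss_percent</code>; any non-zero result would then be worth a closer look.</p>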
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5060632022-03-29T18:08:55Znicksingernsinger@suse.com
<ul></ul><p>okurz wrote:</p>
<blockquote>
<p>Ok, given that I think a good metric could be to run ping from one switch to another switch and also central components, so e.g.</p>
<pre><code>ssh -oKexAlgorithms=+diffie-hellman-group1-sha1 -oHostKeyAlgorithms=+ssh-rsa admin@10.162.0.74 "ping qanet count 600"
</code></pre>
<p>and check for any gaps. WDYT?</p>
</blockquote>
<p>Yeah I forgot to mention that I already did this check right after Gerhard told me they fixed it. I did the ping to grenache and qanet from qanet20 and had 0% loss in 1k pings.</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5063032022-03-30T07:34:15Zokurzokurz@suse.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/506303/diff?detail_id=478532">diff</a>)</li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5064772022-03-30T10:35:32Zokurzokurz@suse.com
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-4 status-3 priority-6 priority-high2 closed child" href="/issues/109241">action #109241</a>: Prefer to use domain names rather than IPv4 in salt pillars size:M</i> added</li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5064832022-03-30T10:35:49Zokurzokurz@suse.com
<ul></ul><p>nicksinger wrote:</p>
<blockquote>
<p>Yeah I forgot to mention that I already did this check right after Gerhard told me they fixed it. I did the ping to grenache and qanet from qanet20 and had 0% loss in 1k pings.</p>
</blockquote>
<p>Yes, and I mean to conduct this check as an automatic continuous monitoring step. Can we do that?</p>
<p>So far no more problems observed. We can now focus on introducing more monitoring checks in our infrastructure. I see two things as necessary before resolving: 1. Check if more recent jobs are still labeled by auto-review, 2. Have at least improvements planned in separate tickets, e.g. additional telegraf ping checks, mtr checks, monitoring for the switches etc. So if you move that out to separate tickets then that is covered.<br>
If no auto-review labeled jobs show up I assume DNS problems are gone. Right now <code>openqa-query-for-job-label poo#108845</code> returns</p>
<pre><code>8435334|2022-03-30 10:13:06|done|failed|gi-guest_developing-on-host_developing-xen||grenache-1
8435341|2022-03-30 10:11:09|done|failed|gi-guest_win2019-on-host_developing-kvm||grenache-1
8435342|2022-03-30 09:31:04|done|failed|virt-guest-migration-developing-from-developing-to-developing-kvm-dst||openqaworker2
8434694|2022-03-30 08:08:46|done|incomplete|qam-minimal-full|backend died: Error connecting to VNC server <s390qa105.qa.suse.de:5901>: IO::Socket::INET: connect: Connection timed out|openqaworker2
8430129|2022-03-29 11:33:42|done|failed|qam-minimal|backend done: Error connecting to VNC server <s390qa101.qa.suse.de:5901>: IO::Socket::INET: connect: Connection timed out|openqaworker2
8429701|2022-03-29 09:01:25|done|incomplete|qam-minimal-full|backend died: Error connecting to VNC server <s390qa106.qa.suse.de:5901>: IO::Socket::INET: connect: Connection timed out|openqaworker2
8429644|2022-03-29 08:35:44|done|incomplete|qam-minimal-full|backend died: Error connecting to VNC server <s390qa101.qa.suse.de:5901>: IO::Socket::INET: connect: Connection timed out|openqaworker2
8429626|2022-03-29 08:34:52|done|incomplete|jeos-containers|backend died: Error connecting to VNC server <openqaw5-xen-1.qa.suse.de:5911>: IO::Socket::INET: connect: Connection timed out|openqaworker2
8429631|2022-03-29 08:34:40|done|incomplete|jeos-filesystem|backend died: Error connecting to VNC server <openqaw5-xen-1.qa.suse.de:5914>: IO::Socket::INET: connect: Connection timed out|openqaworker2
8429627|2022-03-29 08:34:38|done|incomplete|jeos-base+sdk+desktop|backend died: Error connecting to VNC server <openqaw5-xen-1.qa.suse.de:5913>: IO::Socket::INET: connect: Connection timed out|openqaworker2
</code></pre>
<p>so recent results from today. The most recent is <a href="https://openqa.suse.de/tests/8435334#step/boot_from_pxe/9">https://openqa.suse.de/tests/8435334#step/boot_from_pxe/9</a> failing with</p>
<pre><code>Test died: Error connecting to <root@10.162.2.87>: No route to host at /usr/lib/os-autoinst/testapi.pm line 1761.
</code></pre>
<p>but this was not a job labeled by auto-review and it might even be unrelated. 10.162.2.87 is actually "quinn.qa.suse.de". It seems the reverse DNS entry is missing, as <code>host 10.162.2.87</code> returns NXDOMAIN. I also wonder whether we could just use domain names in the openQA worker config. I will try to handle that in <a class="issue tracker-4 status-3 priority-6 priority-high2 closed child" title="action: Prefer to use domain names rather than IPv4 in salt pillars size:M (Resolved)" href="https://progress.opensuse.org/issues/109241">#109241</a></p>
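<p>To separate jobs that really match the auto-review expression from the others in such a listing, the pipe-delimited output can be parsed and filtered. This is a sketch under assumptions: the output has no header, so the field names below are guesses based on the visible values, and the regular expression is the one from this ticket's subject.</p>

```python
import re
from typing import NamedTuple

# auto_review expression copied from the ticket subject
AUTO_REVIEW_RE = re.compile(
    r"(Error connecting to VNC server.*qa.suse.*Connection timed out"
    r"|ipmitool.*qa.suse.*Unable to establish)"
)


class Job(NamedTuple):
    # Field names are assumptions; the tool prints no header
    job_id: int
    timestamp: str
    state: str
    result: str
    test: str
    reason: str
    worker: str


def parse_jobs(text):
    """Parse pipe-delimited lines into Job tuples, skipping malformed lines."""
    jobs = []
    for line in text.splitlines():
        fields = line.split("|")
        if len(fields) != 7:
            continue
        jobs.append(Job(int(fields[0]), *fields[1:]))
    return jobs


def matching_auto_review(jobs):
    """Return only the jobs whose reason matches the auto_review expression."""
    return [job for job in jobs if AUTO_REVIEW_RE.search(job.reason)]


# Two lines taken from the listing above
sample = (
    "8435334|2022-03-30 10:13:06|done|failed|gi-guest_developing-on-host_developing-xen||grenache-1\n"
    "8434694|2022-03-30 08:08:46|done|incomplete|qam-minimal-full|backend died: Error connecting to "
    "VNC server <s390qa105.qa.suse.de:5901>: IO::Socket::INET: connect: Connection timed out|openqaworker2\n"
)

for job in matching_auto_review(parse_jobs(sample)):
    print(job.job_id, job.timestamp, job.worker)
```

<p>Jobs with an empty reason, like 8435334, are filtered out this way, which matches the observation that the most recent failure was not labeled by auto-review.</p>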
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5065162022-03-30T11:41:03Zokurzokurz@suse.com
<ul></ul><p>Discussed with nicksinger:</p>
<ul>
<li>This ticket will focus on rollback steps which nicksinger will carefully conduct and check with the inter-switch ping and ping to qanet from various sources</li>
<li>Clarify with mgmt about missing network monitoring in EngInfra domain -> #109250</li>
<li>Add monitoring, e.g. ping checks in telegraf from each openQA worker (or monitor.qa as source) to qanet.qa, dist.suse.de, download.opensuse.org, scc.suse.com, proxy.scc.suse.de -> <a class="issue tracker-4 status-3 priority-4 priority-default closed child" title="action: Add monitoring for SUSE QA network infrastructure size:M (Resolved)" href="https://progress.opensuse.org/issues/109253">#109253</a></li>
</ul>
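<p>To illustrate the kind of ping check meant in the last item, here is a rough sketch and not the monitoring that was eventually set up. It assumes an iputils-style <code>ping</code> whose summary line contains "N% packet loss". The target list is copied from the item above, and the helper names are hypothetical. Telegraf ships its own ping input plugin, which would be the more natural fit in production.</p>

```python
import re
import subprocess

# Targets as named in the list item above (copied verbatim).
TARGETS = [
    "qanet.qa",
    "dist.suse.de",
    "download.opensuse.org",
    "scc.suse.com",
    "proxy.scc.suse.de",
]

# Matches the loss figure in the iputils ping summary line.
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def parse_packet_loss(ping_output: str) -> float:
    """Extract the packet loss percentage from ping's summary output."""
    match = LOSS_RE.search(ping_output)
    if match is None:
        raise ValueError("no packet loss summary found in ping output")
    return float(match.group(1))


def packet_loss(host: str, count: int = 10) -> float:
    """Ping `host` `count` times and return the reported loss in percent.

    Raises ValueError if ping prints no summary, e.g. when the name
    does not resolve.
    """
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
        check=False,  # ping exits non-zero on loss; we still want the output
    )
    return parse_packet_loss(result.stdout)


if __name__ == "__main__":
    for target in TARGETS:
        try:
            print(f"{target}: {packet_loss(target)}% loss")
        except ValueError as error:
            print(f"{target}: {error}")
```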
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5068702022-03-31T09:38:31Zmkittlermarius.kittler@suse.com
<ul><li><strong>Subject</strong> changed from <i>Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms</i> to <i>Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:M</i></li></ul> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5072932022-04-01T14:07:19Zgeorggkioulis@suse.com
<ul></ul><p>Are the following failures related to this ticket? (or should I open a ticket for performance issues with QA-Power8 kvm workers?)</p>
<p><a href="https://openqa.suse.de/tests/8450511#step/validate_partition_table_via_blkid/5" class="external">https://openqa.suse.de/tests/8450511#step/validate_partition_table_via_blkid/5</a> : blkid command times out (QA-Power8-5-kvm:7)<br>
<a href="https://openqa.suse.de/tests/8450952#step/yast2_system_settings/7" class="external">https://openqa.suse.de/tests/8450952#step/yast2_system_settings/7</a> : slow typing (QA-Power8-5-kvm:2)<br>
<a href="https://openqa.suse.de/tests/8450500#step/zypper_in/2" class="external">https://openqa.suse.de/tests/8450500#step/zypper_in/2</a> : slow typing (QA-Power8-4-kvm:4)<br>
<a href="https://openqa.suse.de/tests/8449706#step/shutdown/3" class="external">https://openqa.suse.de/tests/8449706#step/shutdown/3</a> : stall detected (QA-Power8-5-kvm:7)</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5073512022-04-02T01:28:46Zrfan1richard.fan@suse.com
<ul></ul><p><a class="user active user-mention" href="https://progress.opensuse.org/users/24624">@nicksinger</a><br>
Can you please help check this as well?<br>
<a href="https://openqa.suse.de/tests/8456680#step/firefox_nss/16" class="external">https://openqa.suse.de/tests/8456680#step/firefox_nss/16</a> [powerVM VNC console connection takes more time than other platforms]</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5076302022-04-04T09:26:50Znicksingernsinger@suse.com
<ul></ul><p>geor wrote:</p>
<blockquote>
<p>Are the following failures related to this ticket? (or should I open a ticket for performance issues with QA-Power8 kvm workers?)</p>
<p><a href="https://openqa.suse.de/tests/8450511#step/validate_partition_table_via_blkid/5" class="external">https://openqa.suse.de/tests/8450511#step/validate_partition_table_via_blkid/5</a> : blkid command times out (QA-Power8-5-kvm:7)<br>
<a href="https://openqa.suse.de/tests/8450952#step/yast2_system_settings/7" class="external">https://openqa.suse.de/tests/8450952#step/yast2_system_settings/7</a> : slow typing (QA-Power8-5-kvm:2)<br>
<a href="https://openqa.suse.de/tests/8450500#step/zypper_in/2" class="external">https://openqa.suse.de/tests/8450500#step/zypper_in/2</a> : slow typing (QA-Power8-4-kvm:4)<br>
<a href="https://openqa.suse.de/tests/8449706#step/shutdown/3" class="external">https://openqa.suse.de/tests/8449706#step/shutdown/3</a> : stall detected (QA-Power8-5-kvm:7)</p>
</blockquote>
<p>Hey <a class="user active user-mention" href="https://progress.opensuse.org/users/30196">@geor</a>, I checked the machines and both are connected to qanet13. Since qanet13 is the only switch where the second uplink has not been restored yet (removing the second uplink was our earlier workaround for the unstable connection), I don't think this is related.</p>
<p>rfan1 wrote:</p>
<blockquote>
<p><a class="user active user-mention" href="https://progress.opensuse.org/users/24624">@nicksinger</a><br>
Can you please help check this as well?<br>
<a href="https://openqa.suse.de/tests/8456680#step/firefox_nss/16" class="external">https://openqa.suse.de/tests/8456680#step/firefox_nss/16</a> [powerVM VNC console connection takes more time than other platforms]</p>
</blockquote>
<p>Could you please confirm that more tests are failing on redcurrant? As these machines are not connected to the qanet switches I think we might have a different problem here.</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5076332022-04-04T09:29:26Znicksingernsinger@suse.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Blocked</i></li></ul><p>The links on qanet{10,15,20} are restored. <code>show interfaces po 1</code> shows that 2 ports are active on each switch despite <code>cdp neighbor</code> data missing for the second link.<br>
I did a ping-check to 10.162.0.1 (qanet) and have 0% loss over 1000 packets.</p>
<p>I asked Gerhard over Slack to restore the second link for qanet13 but have had no response yet. Therefore I am setting this to Blocked until I receive further feedback from him.</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5076602022-04-04T09:51:31Zokurzokurz@suse.com
<ul><li><strong>Status</strong> changed from <i>Blocked</i> to <i>Feedback</i></li></ul><p>Sounds good! Please use "Feedback" except for cases where there is another ticket which we can track, or anything public. "Feedback" can also mean "busy-waiting" with polling for a response :)</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5094512022-04-08T09:12:39Zlivdywanliv.dywan@suse.com
<ul><li><strong>Due date</strong> changed from <i>2022-04-08</i> to <i>2022-04-15</i></li></ul><p>Still waiting for confirmation from Gerhard</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5094572022-04-08T09:14:55Znicksingernsinger@suse.com
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Resolved</i></li></ul><p>Got a response from Gerhard today and he brought back the second interface on qanet13. 1k pings to qanet (10.162.0.1) show 0% loss. I'm therefore considering this ticket done :)</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5130182022-04-23T00:40:20Zopenqa_reviewopenqa-review@suse.de
<ul></ul><p>This is an autogenerated message for openQA integration by the openqa_review script:</p>
<p>This bug is still referenced in a failing openQA test: sles+sdk+proxy_SCC_via_YaST_ncurses@ppc64le-hmc-single-disk<br>
<a href="https://openqa.suse.de/tests/8569074#step/addon_products_via_SCC_yast2_ncurses/1" class="external">https://openqa.suse.de/tests/8569074#step/addon_products_via_SCC_yast2_ncurses/1</a></p>
<p>To prevent further reminder comments one of the following options should be followed:</p>
<ol>
<li>The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted</li>
<li>The openQA job group is moved to "Released" or "EOL" (End-of-Life)</li>
<li>The bugref in the openQA scenario is removed or replaced, e.g. <code>label:wontfix:boo1234</code></li>
</ol>
<p>Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5132012022-04-24T17:06:44Zokurzokurz@suse.com
<ul></ul><p>I removed the comment from <a href="https://openqa.suse.de/tests/8569074#comments" class="external">https://openqa.suse.de/tests/8569074#comments</a> so that this ticket is no longer used as a label there.</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5168802022-05-09T00:19:28Zopenqa_reviewopenqa-review@suse.de
<ul></ul><p>This is an autogenerated message for openQA integration by the openqa_review script:</p>
<p>This bug is still referenced in a failing openQA test: prj4_guest_upgrade_sles12sp5_on_sles12sp5-kvm<br>
<a href="https://openqa.suse.de/tests/8685507#step/boot_from_pxe/1" class="external">https://openqa.suse.de/tests/8685507#step/boot_from_pxe/1</a></p>
<p>To prevent further reminder comments one of the following options should be followed:</p>
<ol>
<li>The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted</li>
<li>The openQA job group is moved to "Released" or "EOL" (End-of-Life)</li>
<li>The bugref in the openQA scenario is removed or replaced, e.g. <code>label:wontfix:boo1234</code></li>
</ol>
<p>Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5253592022-06-06T00:28:35Zopenqa_reviewopenqa-review@suse.de
<ul></ul><p>This is an autogenerated message for openQA integration by the openqa_review script:</p>
<p>This bug is still referenced in a failing openQA test: prj4_guest_upgrade_sles12sp5_on_sles12sp5-kvm<br>
<a href="https://openqa.suse.de/tests/8751600#step/boot_from_pxe/1" class="external">https://openqa.suse.de/tests/8751600#step/boot_from_pxe/1</a></p>
<p>To prevent further reminder comments one of the following options should be followed:</p>
<ol>
<li>The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted</li>
<li>The openQA job group is moved to "Released" or "EOL" (End-of-Life)</li>
<li>The bugref in the openQA scenario is removed or replaced, e.g. <code>label:wontfix:boo1234</code></li>
</ol>
<p>Expect the next reminder at the earliest in 56 days if nothing changes in this ticket.</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5255662022-06-06T15:28:42Zokurzokurz@suse.com
<ul><li><strong>Status</strong> changed from <i>Resolved</i> to <i>Feedback</i></li></ul><p>We need to handle the openqa-review comments</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5259922022-06-07T09:35:55Znicksingernsinger@suse.com
<ul></ul><p>okurz wrote:</p>
<blockquote>
<p>We need to handle the openqa-review comments</p>
</blockquote>
<p>I deleted the two mentioned comments because the tests failing are not related to this ticket here. Anything else which needs to be done to "handle the openqa-review comments"?</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5261272022-06-07T12:53:10Zlivdywanliv.dywan@suse.com
<ul></ul><p>nicksinger wrote:</p>
<blockquote>
<p>okurz wrote:</p>
<blockquote>
<p>We need to handle the openqa-review comments</p>
</blockquote>
<p>I deleted the two mentioned comments because the tests failing are not related to this ticket here. Anything else which needs to be done to "handle the openqa-review comments"?</p>
</blockquote>
<p>They also don't seem to match the autoreview expression. I would assume this can be closed</p>
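<p>Whether a failure reason matches the auto_review expression from the ticket subject can be checked directly. The sketch below assumes the expression is applied as an unanchored regular expression search over the reason text, which is an assumption about the auto-review tooling and not taken from its code. The two sample strings are failure messages quoted earlier in this ticket.</p>

```python
import re

# Expression copied from the ticket subject.
AUTO_REVIEW = re.compile(
    r"(Error connecting to VNC server.*qa.suse.*Connection timed out"
    r"|ipmitool.*qa.suse.*Unable to establish)"
)


def matches_auto_review(reason: str) -> bool:
    """Return True if `reason` contains a match for the auto_review expression."""
    return AUTO_REVIEW.search(reason) is not None


if __name__ == "__main__":
    samples = [
        "backend died: Error connecting to VNC server <s390qa101.qa.suse.de:5901>: "
        "IO::Socket::INET: connect: Connection timed out",
        "Test died: Error connecting to <root@10.162.2.87>: No route to host "
        "at /usr/lib/os-autoinst/testapi.pm line 1761.",
    ]
    for sample in samples:
        print(matches_auto_review(sample), "|", sample)
```

<p>The first sample matches the VNC branch of the expression. The second does not match, because it contains neither "VNC server" nor "ipmitool".</p>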
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5262172022-06-07T17:26:06Zokurzokurz@suse.com
<ul></ul><p>nicksinger wrote:</p>
<blockquote>
<p>I deleted the two mentioned comments because the tests failing are not related to this ticket here. Anything else which needs to be done to "handle the openqa-review comments"?</p>
</blockquote>
<p>cdywan wrote:</p>
<blockquote>
<p>They also don't seem to match the autoreview expression. I would assume this can be closed</p>
</blockquote>
<p>Neither helps if carry-over is happening. So it would be good to follow the URLs pointing to this ticket and understand why there is a reference to this ticket. With the comments deleted that's a bit harder :) But querying the database for jobs that have this ticket in a comment can help with that.</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5263372022-06-08T06:53:52Zlivdywanliv.dywan@suse.com
<ul></ul><p>okurz wrote:</p>
<blockquote>
<p>nicksinger wrote:</p>
<blockquote>
<p>I deleted the two mentioned comments because the tests failing are not related to this ticket here. Anything else which needs to be done to "handle the openqa-review comments"?</p>
</blockquote>
<p>cdywan wrote:</p>
<blockquote>
<p>They also don't seem to match the autoreview expression. I would assume this can be closed</p>
</blockquote>
<p>Neither helps if carry-over is happening. So it would be good to follow the URLs pointing to this ticket and understand why there is a reference to this ticket. With the comments deleted that's a bit harder :) But querying the database for jobs that have this ticket in a comment can help with that.</p>
</blockquote>
<p>The comments have been deleted already - what else would cause a carry-over now?</p>
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5263462022-06-08T06:57:50Zlivdywanliv.dywan@suse.com
<ul></ul><p>cdywan wrote:</p>
<blockquote>
<p>okurz wrote:</p>
<blockquote>
<p>nicksinger wrote:</p>
<blockquote>
<p>I deleted the two mentioned comments because the tests failing are not related to this ticket here. Anything else which needs to be done to "handle the openqa-review comments"?</p>
</blockquote>
<p>cdywan wrote:</p>
<blockquote>
<p>They also don't seem to match the autoreview expression. I would assume this can be closed</p>
</blockquote>
<p>Neither helps if carry-over is happening. So it would be good to follow the URLs pointing to this ticket and understand why there is a reference to this ticket. With the comments deleted that's a bit harder :) But querying the database for jobs that have this ticket in a comment can help with that.</p>
</blockquote>
<p>The comments have been deleted already - what else would cause a carry-over now?</p>
</blockquote>
<p>I guess this answers my question:</p>
<pre><code>./openqa-query-for-job-label poo#108845
8766983|2022-05-16 19:31:03|done|incomplete|qam-minimal|backend died: Error connecting to VNC server <s390qa102.qa.suse.de:5901>: IO::Socket::INET: connect: timeout|openqaworker2
8751821|2022-05-13 23:35:51|done|failed|uefi-gi-guest_sles12sp5-on-host_developing-xen||openqaworker2
8737947|2022-05-12 07:43:09|done|failed|prj4_guest_upgrade_sles12sp5_on_sles12sp5-kvm||grenache-1
8729421|2022-05-10 12:46:06|done|failed|modify_existing_partition||grenache-1
8722867|2022-05-09 07:25:56|done|failed|qam-minimal-full|backend done: Error connecting to VNC server <s390qa104.qa.suse.de:5901>: IO::Socket::INET: connect: Connection timed out|openqaworker2
8722866|2022-05-09 07:25:44|done|incomplete|qam-minimal|backend died: Error connecting to VNC server <s390qa106.qa.suse.de:5901>: IO::Socket::INET: connect: Connection timed out|openqaworker2
</code></pre> openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5264152022-06-08T07:38:18Zlivdywanliv.dywan@suse.com
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Resolved</i></li></ul><p>cdywan wrote:</p>
<blockquote>
<p>okurz wrote:</p>
<blockquote>
<p>Neither helps if carry-over is happening. So it would be good to follow the URLs pointing to this ticket and understand why there is a reference to this ticket. With the comments deleted that's a bit harder :) But querying the database for jobs that have this ticket in a comment can help with that.</p>
</blockquote>
</blockquote>
<p>All references are gone now</p>
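<p>For anyone who wants to post-process the pipe-separated lines printed by <code>openqa-query-for-job-label</code>, as shown earlier in this ticket, a small parser could look like the sketch below. The field names are guesses based on the visible columns and are not taken from the script itself.</p>

```python
from typing import NamedTuple


class LabeledJob(NamedTuple):
    """One output line; field names are assumptions for illustration."""
    job_id: int
    timestamp: str
    state: str
    result: str
    test: str
    reason: str
    worker: str


def parse_job_line(line: str) -> LabeledJob:
    """Split one 'id|timestamp|state|result|test|reason|worker' line.

    The first five fields are split from the left and the worker from the
    right, so a '|' inside the reason text would not break the parsing.
    """
    job_id, timestamp, state, result, test, rest = line.rstrip("\n").split("|", 5)
    reason, _sep, worker = rest.rpartition("|")
    return LabeledJob(int(job_id), timestamp, state, result, test, reason, worker)


if __name__ == "__main__":
    line = (
        "8751821|2022-05-13 23:35:51|done|failed|"
        "uefi-gi-guest_sles12sp5-on-host_developing-xen||openqaworker2"
    )
    print(parse_job_line(line))
```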
openQA Infrastructure - action #108845: Network performance problems, DNS, DHCP, within SUSE QA network auto_review:"(Error connecting to VNC server.*qa.suse.*Connection timed out|ipmitool.*qa.suse.*Unable to establish)":retry but also other symptoms size:Mhttps://progress.opensuse.org/issues/108845?journal_id=5362252022-07-13T09:41:20Zokurzokurz@suse.com
<ul><li><strong>Due date</strong> deleted (<del><i>2022-04-15</i></del>)</li></ul>