action #156322

closed

zabbix-proxy.dmz-prg2.suse.org not reachable from ariel.suse-dmz.opensuse.org

Added by jbaier_cz about 2 months ago. Updated 23 days ago.

Status: Resolved
Priority: Low
Assignee:
Category: -
Target version:
Start date: 2024-02-29
Due date:
% Done: 0%
Estimated time:

Description

Observation

Zabbix proxy is not reachable from ariel, hence the monitoring of that host is not working at all.

Error message from zabbix frontend: Received empty response from Zabbix Agent at [10.150.1.11]. Assuming that agent dropped connection because of access permissions.

new-ariel #  ping -c3 zabbix-proxy.dmz-prg2.suse.org
PING zabbix-proxy.dmz-prg2.suse.org (10.150.1.22) 56(84) bytes of data.
From ariel.suse-dmz.opensuse.org (10.150.1.11) icmp_seq=1 Destination Host Unreachable
From ariel.suse-dmz.opensuse.org (10.150.1.11) icmp_seq=2 Destination Host Unreachable
From ariel.suse-dmz.opensuse.org (10.150.1.11) icmp_seq=3 Destination Host Unreachable

--- zabbix-proxy.dmz-prg2.suse.org ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2045ms

Related issues 3 (2 open, 1 closed)

Related to openQA Infrastructure - action #133358: Migration of o3 VM to PRG2 - Ensure IPv6 is fully working (Blocked, okurz)

Related to openQA Infrastructure - action #136274: Failing DNS resolution on o3 for hosts like github.com (Resolved, okurz, 2023-09-21)

Blocks openQA Infrastructure - action #40196: [monitoring] monitor internal port 9526, port 80, external port 443 accessibility of o3 and response times size:M (Blocked, jbaier_cz, 2018-08-23)

Actions #1

Updated by jbaier_cz about 2 months ago

  • Blocks action #40196: [monitoring] monitor internal port 9526, port 80, external port 443 accessibility of o3 and response times size:M added
Actions #2

Updated by jbaier_cz about 2 months ago · Edited

  • Status changed from New to Blocked

Created SD-149785

Actions #3

Updated by jbaier_cz about 2 months ago

Apparently the problem has been present for a long time already: https://zabbix.suse.de/tr_events.php?triggerid=115371&eventid=681670480. The same is visible on the other side (zabbix-proxy.dmz-prg2.suse.org), where the corresponding problem is https://zabbix.suse.de/tr_events.php?triggerid=114242&eventid=681662162

Actions #4

Updated by jbaier_cz about 2 months ago

  • Priority changed from Normal to High
Actions #5

Updated by livdywan about 2 months ago

jbaier_cz wrote in #note-2:

Created SD-149785

Posted a message in Slack asking whether someone can at least triage this ticket (it got flagged by our SLO).

Actions #6

Updated by okurz about 2 months ago

  • Priority changed from High to Low

@Liv Dywan I appreciate you trying to remedy our SLO alerts, but I think we can do better. You brought up the relevant SD ticket in chat, but there are reasons why people have not looked into that SD ticket yet, and I feel we are just distracting them by using a side channel for a not-so-severe issue; it also keeps you busy. I think we should have changed the priority of our ticket to Low as soon as we created the SD ticket because, to be honest, no real harm is done if the SD ticket is not worked on over the next weeks. In the meantime you can focus on more important tasks where we can actually make a difference. Reducing the ticket to "Low" accordingly now.

Actions #7

Updated by okurz about 1 month ago

  • Target version changed from Ready to Tools - Next
Actions #8

Updated by jbaier_cz 23 days ago

  • Related to action #133358: Migration of o3 VM to PRG2 - Ensure IPv6 is fully working added
Actions #9

Updated by jbaier_cz 23 days ago

  • Related to action #136274: Failing DNS resolution on o3 for hosts like github.com added
Actions #10

Updated by jbaier_cz 23 days ago

  • Status changed from Blocked to Resolved
  • Target version changed from Tools - Next to Ready

It turned out the issue was caused by the following:

  • the proxy was moved into another network
  • we had a static record for the proxy in /etc/hosts
  • the dnsmasq configuration on ariel was not resolving reverse records for the 10.in-addr.arpa. zone
  • IPv6 was broken (in fact completely missing) on eth1
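
To illustrate the reverse-record point: a PTR lookup for an address in 10.0.0.0/8 queries a name under 10.in-addr.arpa, so that zone needs an upstream server that can answer it. The following is only a minimal sketch using Python's standard ipaddress module; the helper names reverse_name and in_zone are hypothetical, and 10.151.15.2 is used purely as a sample address.

```python
import ipaddress


def reverse_name(addr: str) -> str:
    """Return the PTR query name for an IPv4 or IPv6 address."""
    return ipaddress.ip_address(addr).reverse_pointer


def in_zone(addr: str, zone: str) -> bool:
    """Return True if the PTR name of addr falls under the given reverse zone."""
    name = reverse_name(addr)
    zone = zone.rstrip(".")  # accept the zone with or without a trailing dot
    return name == zone or name.endswith("." + zone)


if __name__ == "__main__":
    # Sample address only; any host in 10.0.0.0/8 maps into the same zone.
    print(reverse_name("10.151.15.2"))
    print(in_zone("10.151.15.2", "10.in-addr.arpa."))
```

If in_zone returns True for an address, the resolver on ariel has to be able to forward queries for that zone, otherwise the reverse lookup fails.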

I made the following changes to remedy the situation:

  1. the host record for zabbix-proxy.dmz-prg2.suse.org was deleted from /etc/hosts, so the forward record is now answered correctly
  2. the dnsmasq configuration was modified to include server=/10.in-addr.arpa/10.151.53.53 (so things like dig -x 10.151.15.2 work correctly and we can resolve the reverse record for the proxy)
  3. the wicked configuration now includes the address 2a07:de40:b281:1:10:150:1:11/64 for eth1, because the proxy has an IPv6 address and zabbix wants to use it
  4. an extra route (similar to the IPv4 setup), 2a07:de40:b280:15::/64 via 2a07:de40:b281:1:ffff:ffff:ffff:ffff dev eth1, was added to make the proxy reachable via IPv6
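
As a sanity check on items 3 and 4, the sketch below verifies with Python's ipaddress module that the gateway lies inside the /64 configured on eth1 (so it is reachable on-link) and that the routed prefix lies outside it (so an explicit route is needed). The addresses are taken from the list above; the function names are hypothetical and this does not inspect the live system.

```python
import ipaddress

# Values from the changes listed above.
ETH1_ADDRESS = ipaddress.ip_interface("2a07:de40:b281:1:10:150:1:11/64")
GATEWAY = ipaddress.ip_address("2a07:de40:b281:1:ffff:ffff:ffff:ffff")
ROUTED_PREFIX = ipaddress.ip_network("2a07:de40:b280:15::/64")


def gateway_is_on_link(iface, gateway) -> bool:
    """The next hop must be inside the interface's own network."""
    return gateway in iface.network


def needs_explicit_route(iface, prefix) -> bool:
    """A prefix that does not overlap the on-link network needs a route."""
    return not prefix.overlaps(iface.network)


if __name__ == "__main__":
    print("on-link network:", ETH1_ADDRESS.network)
    print("gateway on link:", gateway_is_on_link(ETH1_ADDRESS, GATEWAY))
    print("route required:", needs_explicit_route(ETH1_ADDRESS, ROUTED_PREFIX))
```

On the host itself, the equivalent check would be to look at the actual routing table and ping the proxy over IPv6.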