action #138551

closed

DNS outage of 2023-10-25, e.g. Cron <root@openqa-service> (date; fetch_openqa_bugs)> /tmp/fetch_openqa_bugs_osd.log Max retries exceeded with url size:S

Added by livdywan about 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Start date: 2023-10-23
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Observation

Exception occured while fetching poo#119818
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 160, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/usr/lib/python3.6/site-packages/urllib3/util/connection.py", line 61, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/lib64/python3.6/socket.py", line 745, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 677, in urlopen
    chunked=chunked,
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 381, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 978, in _validate_conn
    conn.connect()
  File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 309, in connect
    conn = self._new_conn()
  File "/usr/lib/python3.6/site-packages/urllib3/connection.py", line 172, in _new_conn
    self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f52ccb85048>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 727, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/usr/lib/python3.6/site-packages/urllib3/util/retry.py", line 439, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='progress.opensuse.org', port=443): Max retries exceeded with url: /issues/119818.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f52ccb85048>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/fetch_openqa_bugs", line 62, in <module>
    raise e
  File "/usr/bin/fetch_openqa_bugs", line 48, in <module>
    issue = issue_fetcher.get_issue(bugid)
  File "/usr/lib/python3.6/site-packages/openqa_bugfetcher/issues/__init__.py", line 88, in get_issue
    return self.prefix_table[prefix](self.conf, bugid)
  File "/usr/lib/python3.6/site-packages/openqa_bugfetcher/issues/__init__.py", line 24, in __init__
    self.fetch(conf)
  File "/usr/lib/python3.6/site-packages/openqa_bugfetcher/issues/progress_issue.py", line 12, in fetch
    req = requests.get(url, headers={"X-Redmine-API-Key": conf["progress"]["api_key"]}, timeout=10)
  File "/usr/lib/python3.6/site-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 532, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='progress.opensuse.org', port=443): Max retries exceeded with url: /issues/119818.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f52ccb85048>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))

There are two such failures for progress.opensuse.org and one for bugs.kde.org that looks almost the same, so this seems to be a problem with network access from the host rather than an issue on the remote side.
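
The reasoning above (unrelated hosts failing with the same resolution error points at the local side) can be sketched as a small check. This is only an illustrative sketch, not part of fetch_openqa_bugs: the function names `resolution_errors` and `looks_like_local_dns_problem` are hypothetical, and the injectable `resolve` parameter is an assumption added so the logic can be exercised without network access.

```python
import socket
from typing import Callable, Dict, Iterable


def resolution_errors(
    hosts: Iterable[str], resolve: Callable = socket.getaddrinfo
) -> Dict[str, str]:
    """Map each host that fails name resolution to its error message."""
    errors = {}
    for host in hosts:
        try:
            # Port 443 matches the HTTPS requests seen in the traceback.
            resolve(host, 443)
        except socket.gaierror as exc:
            errors[host] = str(exc)
    return errors


def looks_like_local_dns_problem(
    hosts: Iterable[str], resolve: Callable = socket.getaddrinfo
) -> bool:
    """Return True if more than one host was checked and none resolved.

    Unrelated hosts all failing at once suggests the resolver or network
    on this machine, not the remote services, is at fault.
    """
    hosts = list(hosts)
    errors = resolution_errors(hosts, resolve)
    return len(hosts) > 1 and len(errors) == len(hosts)
```

Run on the affected host as `looks_like_local_dns_problem(["progress.opensuse.org", "bugs.kde.org"])`, a result of `True` would support the conclusion that the problem is local; a single failing host would point at that remote service instead.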


Related issues 2 (0 open, 2 closed)

Related to openQA Infrastructure (public) - action #138527: Zabbix agent on ariel.dmz-prg2.suse.org reported no data for 30m and there is nothing in the journal size:S (Resolved, livdywan, 2023-07-07)
Related to openQA Infrastructure (public) - action #138545: Munin - minion hook failed - opensuse.org :: openqa.opensuse.org size:S (Resolved, tinita, 2023-11-28)

Actions #1

Updated by okurz about 1 year ago

  • Tags set to infra
  • Subject changed from Cron <root@openqa-service> (date; fetch_openqa_bugs)> /tmp/fetch_openqa_bugs_osd.log Max retries exceeded with url to DNS outage of 2023-10-25, e.g. Cron <root@openqa-service> (date; fetch_openqa_bugs)> /tmp/fetch_openqa_bugs_osd.log Max retries exceeded with url
  • Priority changed from High to Urgent

This is most certainly linked to bigger network problems that also affect users, e.g. imap.suse.de.

Actions #2

Updated by jbaier_cz about 1 year ago

  • Related to action #138527: Zabbix agent on ariel.dmz-prg2.suse.org reported no data for 30m and there is nothing in the journal size:S added
Actions #3

Updated by livdywan about 1 year ago

  • Subject changed from DNS outage of 2023-10-25, e.g. Cron <root@openqa-service> (date; fetch_openqa_bugs)> /tmp/fetch_openqa_bugs_osd.log Max retries exceeded with url to DNS outage of 2023-10-25, e.g. Cron <root@openqa-service> (date; fetch_openqa_bugs)> /tmp/fetch_openqa_bugs_osd.log Max retries exceeded with url size:S
  • Status changed from New to In Progress
  • Assignee set to livdywan
  • Priority changed from Urgent to High

This was being discussed in Slack https://app.slack.com/client/T02863RC2AC/C029APBKLGK and may be related to the DCT migration. I will try to find out more, and at least track the issue. We discussed it briefly and probably can't do anything on our side.

It's likely already solved. I'll update here if I see more cases.

Actions #4

Updated by livdywan about 1 year ago

  • Status changed from In Progress to Feedback

Hasn't come back so far

Actions #5

Updated by livdywan about 1 year ago

  • Related to action #138545: Munin - minion hook failed - opensuse.org :: openqa.opensuse.org size:S added
Actions #6

Updated by livdywan about 1 year ago

  • Status changed from Feedback to Resolved

I'd say there's nothing more to be done here.
