action #95822

qa-maintenance/openQABot failed to trigger aggregate tests with "urllib.error.HTTPError: HTTP Error 500: Internal Server Error"

Added by okurz 10 months ago. Updated 10 months ago.

Status: New
Priority: Normal
Assignee: -
Target version:
Start date: 2021-07-22
Due date:
% Done: 0%
Estimated time:

Description

Observation

From https://chat.suse.de/channel/qem-openqa-review?msg=ifWGbs7QXfJdGJqTf

From ssh qam2 'journalctl -M openqabot -u openqabot-full --since=2021-07-22':

Jul 22 01:09:50 openqabot oqaqambot[21718]: INFO: Updates shedule enabled for this run on PUBCLOUD12SP5AZUREStandardgen2:x86_64
Jul 22 01:09:50 openqabot oqaqambot[21718]: INFO: sle-12-SP5-x86_64 repohash: 4a870c348452ec6fb6c9ca52b30d9aea
Jul 22 01:09:50 openqabot oqaqambot[21718]: INFO: Incidents in sle-12-SP5-x86_64: {'ARCH': 'x86_64',
Jul 22 01:09:50 openqabot oqaqambot[21718]:  'BUILD': '20210722-1',
Jul 22 01:09:50 openqabot oqaqambot[21718]:  'DISTRI': 'sle',
Jul 22 01:09:50 openqabot oqaqambot[21718]:  'FLAVOR': 'AZURE-Standard-gen2-Updates',
Jul 22 01:09:50 openqabot oqaqambot[21718]:  'OS_TEST_ISSUES': '20175,20204,20222,20248,20258,20283,20344,20353,20354,20431,20434,20450,20475,20477,20485,4705',
Jul 22 01:09:50 openqabot oqaqambot[21718]:  'PUBLICCLOUD_TOOLS_IMAGE_QUERY': 'https://openqa.suse.de/group_overview/276.json',
Jul 22 01:09:50 openqabot oqaqambot[21718]:  'PUBLIC_CLOUD_AZURE_OFFER': 'sles-12-sp5',
Jul 22 01:09:50 openqabot oqaqambot[21718]:  'PUBLIC_CLOUD_AZURE_SKU': 'gen2',
Jul 22 01:09:50 openqabot oqaqambot[21718]:  'PUBLIC_CLOUD_IMAGE_ID': '',
Jul 22 01:09:50 openqabot oqaqambot[21718]:  'REPOHASH': '4a870c348452ec6fb6c9ca52b30d9aea',
Jul 22 01:09:50 openqabot oqaqambot[21718]:  'SDK_TEST_ISSUES': '20204,20222,20225,20248,20274,20283,20326,20344,20354,20434,20450,20475,20477',
Jul 22 01:09:50 openqabot oqaqambot[21718]:  'VERSION': '12-SP5',
Jul 22 01:09:50 openqabot oqaqambot[21718]:  '_OBSOLETE': 1}
Jul 22 01:09:57 openqabot oqaqambot[21718]: WARNING: PUBCLOUD12SP5AZUREBasic is outdated: 20210721-1
Jul 22 01:09:57 openqabot oqaqambot[21718]: INFO: Updates shedule enabled for this run on PUBCLOUD12SP5AZUREBasic:x86_64
Jul 22 01:09:57 openqabot oqaqambot[21718]: INFO: sle-12-SP5-x86_64 repohash: 4a870c348452ec6fb6c9ca52b30d9aea
Jul 22 01:09:57 openqabot oqaqambot[21718]: INFO: Incidents in sle-12-SP5-x86_64: {'ARCH': 'x86_64',
Jul 22 01:09:57 openqabot oqaqambot[21718]:  'BUILD': '20210722-1',
Jul 22 01:09:57 openqabot oqaqambot[21718]:  'DISTRI': 'sle',
Jul 22 01:09:57 openqabot oqaqambot[21718]:  'FLAVOR': 'AZURE-Basic-Updates',
Jul 22 01:09:57 openqabot oqaqambot[21718]:  'OS_TEST_ISSUES': '20175,20204,20222,20248,20258,20283,20344,20353,20354,20431,20434,20450,20475,20477,20485,4705',
Jul 22 01:09:57 openqabot oqaqambot[21718]:  'PUBLICCLOUD_TOOLS_IMAGE_QUERY': 'https://openqa.suse.de/group_overview/276.json',
Jul 22 01:09:57 openqabot oqaqambot[21718]:  'PUBLIC_CLOUD_AZURE_OFFER': 'sles-12-sp5-basic',
Jul 22 01:09:57 openqabot oqaqambot[21718]:  'PUBLIC_CLOUD_AZURE_SKU': 'gen1',
Jul 22 01:09:57 openqabot oqaqambot[21718]:  'PUBLIC_CLOUD_IMAGE_ID': '',
Jul 22 01:09:57 openqabot oqaqambot[21718]:  'REPOHASH': '4a870c348452ec6fb6c9ca52b30d9aea',
Jul 22 01:09:57 openqabot oqaqambot[21718]:  'SDK_TEST_ISSUES': '20204,20222,20225,20248,20274,20283,20326,20344,20354,20434,20450,20475,20477',
Jul 22 01:09:57 openqabot oqaqambot[21718]:  'VERSION': '12-SP5',
Jul 22 01:09:57 openqabot oqaqambot[21718]:  '_OBSOLETE': 1}
Jul 22 01:10:02 openqabot oqaqambot[21718]: Traceback (most recent call last):
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/bin/oqaqambot", line 11, in <module>
Jul 22 01:10:02 openqabot oqaqambot[21718]:     load_entry_point('openQABot==0.3.0', 'console_scripts', 'oqaqambot')()
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/lib/python3.6/site-packages/openqabot/main.py", line 18, in main
Jul 22 01:10:02 openqabot oqaqambot[21718]:     sys.exit(run_bot(logger, args, sys))
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/lib/python3.6/site-packages/openqabot/main.py", line 41, in run_bot
Jul 22 01:10:02 openqabot oqaqambot[21718]:     return OpenQABot(metadata, args)()
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/lib/python3.6/site-packages/openqabot/openqabot.py", line 110, in __call__
Jul 22 01:10:02 openqabot oqaqambot[21718]:     self.calculate_updates()
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/lib/python3.6/site-packages/openqabot/openqabot.py", line 142, in calculate_updates
Jul 22 01:10:02 openqabot oqaqambot[21718]:     incidents = updates.gather_incidents(self.apiurl, arch)
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/lib/python3.6/site-packages/openqabot/update/updates.py", line 109, in gather_incidents
Jul 22 01:10:02 openqabot oqaqambot[21718]:     req = self.is_incident_in_testing(apiurl, incident)
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/lib/python3.6/site-packages/openqabot/update/updates.py", line 59, in is_incident_in_testing
Jul 22 01:10:02 openqabot oqaqambot[21718]:     res = osc.core.search(apiurl, request=xpath)["request"]
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/lib/python3.6/site-packages/osc/core.py", line 6819, in search
Jul 22 01:10:02 openqabot oqaqambot[21718]:     f = http_GET(u)
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/lib/python3.6/site-packages/osc/core.py", line 3421, in http_GET
Jul 22 01:10:02 openqabot oqaqambot[21718]:     def http_GET(*args, **kwargs):    return http_request('GET', *args, **kwargs)
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/lib/python3.6/site-packages/osc/core.py", line 3410, in http_request
Jul 22 01:10:02 openqabot oqaqambot[21718]:     fd = urlopen(req, data=data)
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/lib64/python3.6/urllib/request.py", line 223, in urlopen
Jul 22 01:10:02 openqabot oqaqambot[21718]:     return opener.open(url, data, timeout)
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/lib64/python3.6/urllib/request.py", line 532, in open
Jul 22 01:10:02 openqabot oqaqambot[21718]:     response = meth(req, response)
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/lib64/python3.6/urllib/request.py", line 642, in http_response
Jul 22 01:10:02 openqabot oqaqambot[21718]:     'http', request, response, code, msg, hdrs)
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/lib64/python3.6/urllib/request.py", line 570, in error
Jul 22 01:10:02 openqabot oqaqambot[21718]:     return self._call_chain(*args)
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/lib64/python3.6/urllib/request.py", line 504, in _call_chain
Jul 22 01:10:02 openqabot oqaqambot[21718]:     result = func(*args)
Jul 22 01:10:02 openqabot oqaqambot[21718]:   File "/usr/lib64/python3.6/urllib/request.py", line 650, in http_error_default
Jul 22 01:10:02 openqabot oqaqambot[21718]:     raise HTTPError(req.full_url, code, msg, hdrs, fp)
Jul 22 01:10:02 openqabot oqaqambot[21718]: urllib.error.HTTPError: HTTP Error 500: Internal Server Error
Jul 22 01:10:03 openqabot systemd[1]: openqabot-full.service: Main process exited, code=exited, status=1/FAILURE
Jul 22 01:10:03 openqabot systemd[1]: Failed to start Schedule and review Maintenance incidents in openQA full run.

Problem

This could be a temporary performance problem on openqa.suse.de. In any case, the failed request should be retried.

Suggestions

  • Check the logs on openqa.suse.de for the same timestamp (beware of timezone differences)
  • Implement a retry. This might even be possible at the systemd service level, but that could trigger many redundant jobs if a minor failure occurs after many jobs have already been triggered
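As an illustration of the retry suggestion, here is a minimal sketch of a retry at the level of a single HTTP call rather than the whole service. The helper name retry_on_server_error and its parameters are hypothetical and not part of openQABot or osc; it only assumes that the failing call raises urllib.error.HTTPError, as seen in the traceback above.

```python
import time
import urllib.error


def retry_on_server_error(func, attempts=3, delay=5.0, sleep=time.sleep):
    """Call func() and retry it when it raises an HTTP 5xx error.

    Hypothetical helper for illustration only; the name, the default
    number of attempts and the delay are assumptions, not openQABot code.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except urllib.error.HTTPError as err:
            # Only server-side errors (5xx) are treated as transient.
            # Other HTTP errors and the last failed attempt are re-raised.
            if err.code < 500 or attempt == attempts:
                raise
            # Linear back-off: wait longer after each failed attempt.
            sleep(delay * attempt)
```

Going by the traceback, the call in is_incident_in_testing could then be wrapped roughly as retry_on_server_error(lambda: osc.core.search(apiurl, request=xpath)). Because only the single failed query is repeated, jobs that were already triggered earlier in the run are not scheduled again, which avoids the redundancy concern of a retry at the systemd level.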

Workaround

On qam2: Trigger systemctl -M openqabot start openqabot-full manually. Caution: This takes several minutes, so it is better to run it in a screen session and monitor it with journalctl -M openqabot -u openqabot-full -f


Related issues

Related to openQA Project - action #101274: openQABot pipeline failed with NewConnectionError (New, 2021-10-21)

Related to openQA Infrastructure - action #105169: Pipeline of openQABot project fails with "urllib.error.HTTPError: HTTP Error 503: Service Unavailable" causing alert/notification (Resolved, 2022-01-20)

History

#1 Updated by okurz 10 months ago

  • Description updated (diff)

On qam2 I manually triggered systemctl -M openqabot start openqabot-full, which finished after about 20 minutes with ~3k jobs triggered. According to osukup in https://chat.suse.de/channel/qem-openqa-review?msg=ZAfjko7b5dD9RQsn9 the error 500 came from an osc call, so the cause was build.suse.de being down. As coolo stated in https://chat.suse.de/channel/qem-openqa-review?msg=SQjopeH6LdvbY39Jx the IBS backend was down "1:33-1:35 am", which would explain the error 500. However, openQABot reported the problem at "1:10" (timezone unclear).

#2 Updated by okurz 4 months ago

  • Related to action #101274: openQABot pipeline failed with NewConnectionError added

#3 Updated by okurz 4 months ago

  • Related to action #105169: Pipeline of openQABot project fails with "urllib.error.HTTPError: HTTP Error 503: Service Unavailable" causing alert/notification added
