action #105169

closed

Pipeline of openQABot project fails with "urllib.error.HTTPError: HTTP Error 503: Service Unavailable" causing alert/notification

Added by mkittler almost 3 years ago. Updated almost 3 years ago.

Status: Resolved
Priority: Low
Assignee:
Category: -
Start date: 2022-01-20
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Observation

Problem

It does not look like a retry was attempted. It is also not clear how the alert should be handled. Restarting the job let it pass, but it is not clear whether a manual restart is generally required here.

Acceptance criteria


Related issues 3 (1 open, 2 closed)

Related to QA (public) - action #95822: qa-maintenance/openQABot failed to trigger aggregate tests with "urllib.error.HTTPError: HTTP Error 500: Internal Server Error" (Resolved, okurz, 2021-07-22)
Related to openQA Project (public) - action #101274: openQABot pipeline failed with NewConnectionError (New, 2021-10-21)
Related to openQA Infrastructure (public) - action #104085: openQABot pipeline failed with terminating connection due to administrator command size:S (Resolved, jbaier_cz, 2021-12-16)

Actions #1

Updated by okurz almost 3 years ago

  • Status changed from New to In Progress
  • Assignee set to okurz
  • Target version set to Ready

I will look into retrying

Actions #2

Updated by okurz almost 3 years ago

  • Related to action #95822: qa-maintenance/openQABot failed to trigger aggregate tests with "urllib.error.HTTPError: HTTP Error 500: Internal Server Error" added
Actions #3

Updated by okurz almost 3 years ago

  • Related to action #101274: openQABot pipeline failed with NewConnectionError added
Actions #4

Updated by okurz almost 3 years ago

  • Related to action #104085: openQABot pipeline failed with terminating connection due to administrator command size:S added
Actions #5

Updated by okurz almost 3 years ago

  • Status changed from In Progress to New
  • Assignee deleted (okurz)
  • Priority changed from Normal to Low

@jbaier_cz you implemented retry for internal request calls in bbf4e04. Looking at the call stack in https://gitlab.suse.de/qa-maintenance/openQABot/-/jobs/799531 it looks like the problem is now within the osc Python library. WDYT, should we write a custom retry loop or patch the osc library to retry?

Actions #6

Updated by jbaier_cz almost 3 years ago

Yeah, looking at the stack, self.commentapi.get_comments(**kwargs) pretty much shows that the error came from python-osc; we probably hit a moment when OBS was unresponsive, being redeployed, or something similar. It would probably be nicer to add the retry to the osc lib itself, as we use it in many places (and in other projects as well).

In the meantime, as a workaround, we can probably retrigger the pipeline itself automatically. But I am not sure this problem occurs often enough to bother with the workaround.

Actions #7

Updated by jbaier_cz almost 3 years ago

  • Status changed from New to In Progress
  • Assignee set to jbaier_cz
Actions #8

Updated by jbaier_cz almost 3 years ago

  • Status changed from In Progress to Feedback

It turned out that I do not want to touch the code much: the retry would be needed in osc/core.py, which uses urllib.request.urlopen() and some other stuff around it. Instead, I decided to create a wrapper for the http_GET function that is exposed to comments.py. As I wanted to make minimal to no changes to comments.py, I kept the current XML parsing (which expects a file handle to open).

I created https://gitlab.suse.de/qa-maintenance/openQABot/-/merge_requests/91 to address the issues.
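
For readers without access to the merge request, the wrapper idea described in this comment could look roughly like the sketch below. This is not the merged code: the function name retry_http_get, its parameters, and the backoff policy are assumptions made for illustration, and the real http_GET from osc is replaced by any callable that takes a URL and returns a readable object. The sketch retries on HTTP 5xx and connection errors, logs each failure, and returns an in-memory file-like object so that XML parsing code expecting a file handle keeps working.

```python
import io
import logging
import time
from urllib.error import HTTPError, URLError

log = logging.getLogger(__name__)


def retry_http_get(http_get, url, retries=3, delay=1.0, sleep=time.sleep):
    """Call http_get(url), retrying on 5xx responses and connection errors.

    Hypothetical helper: http_get is any callable returning an object with
    read(). The body is returned as io.BytesIO so callers that expect a
    file handle (for example an XML parser) need no changes.
    """
    for attempt in range(1, retries + 1):
        try:
            response = http_get(url)
            return io.BytesIO(response.read())
        except HTTPError as e:
            # Client errors (4xx) will not go away on retry, so re-raise.
            if e.code < 500 or attempt == retries:
                raise
            log.warning("GET %s failed with HTTP %s (attempt %d/%d)",
                        url, e.code, attempt, retries)
        except URLError as e:
            if attempt == retries:
                raise
            log.warning("GET %s failed: %s (attempt %d/%d)",
                        url, e.reason, attempt, retries)
        # Exponential backoff: delay, 2*delay, 4*delay, ...
        sleep(delay * 2 ** (attempt - 1))
```

Passing the callable in, rather than importing osc directly, keeps the sketch testable without network access; in the bot, the callable would presumably be the http_GET function from osc.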

Actions #9

Updated by jbaier_cz almost 3 years ago

  • Status changed from Feedback to Resolved

Code is merged and running, so far without issues. In the future, we should see more detailed information about the problem, and the request should be auto-retried first.
