action #105169: Pipeline of openQABot project fails with "urllib.error.HTTPError: HTTP Error 503: Service Unavailable" causing alert/notification - openQA Infrastructure (public) - openSUSE Project Management Tool


action #105169

closed

Pipeline of openQABot project fails with "urllib.error.HTTPError: HTTP Error 503: Service Unavailable" causing alert/notification

Added by mkittler over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Low
Assignee: jbaier_cz
Category:
Target version: openQA Project (public) - Ready
Start date: 2022-01-20
Due date:
% Done:
Estimated time:
Tags: alert

Description

Observation

See ticket description
Restarting the job let it pass
Pipeline: https://gitlab.suse.de/qa-maintenance/openQABot/-/pipelines/302125
Job: https://gitlab.suse.de/qa-maintenance/openQABot/-/jobs/799531

Problem

It does not look like a retry was attempted. It is also not clear how the alert should be handled. Restarting the job let it pass, but it is not clear whether a manual restart is generally required here.

Acceptance criteria

AC1: The alert is actionable (e.g. there's some information about it on https://progress.opensuse.org/projects/qa/wiki#Alert-handling) or suppressed if there's nothing to be done.
AC2: The actual problem is avoided if possible (e.g. by adding a retry)

Related issues 3 (1 open — 2 closed)


Updated by okurz over 3 years ago

Status changed from New to In Progress
Assignee set to okurz
Target version set to Ready

I will look into retrying


Updated by okurz over 3 years ago

Related to action #95822: qa-maintenance/openQABot failed to trigger aggregate tests with "urllib.error.HTTPError: HTTP Error 500: Internal Server Error" added


Updated by okurz over 3 years ago

Related to action #101274: openQABot pipeline failed with NewConnectionError added


Updated by okurz over 3 years ago

Related to action #104085: openQABot pipeline failed with terminating connection due to administrator command size:S added


Updated by okurz over 3 years ago

Status changed from In Progress to New
Assignee deleted (~~okurz~~)
Priority changed from Normal to Low

@jbaier_cz you implemented retry for internal request calls in bbf4e04. Looking at the call stack in https://gitlab.suse.de/qa-maintenance/openQABot/-/jobs/799531, it looks like the problem is now within the osc Python library. WDYT, should we do a custom retry loop or patch the osc library to retry?
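
For illustration, a custom retry loop of the kind asked about above could look roughly like this. It is only a sketch, not the code from bbf4e04 or from openQABot: the helper name `retry_on_http_error` and its `retries`, `delay` and `sleep` parameters are hypothetical, and it assumes the failing call raises `urllib.error.HTTPError` as seen in the job log.

```python
import time
import urllib.error


def retry_on_http_error(func, *args, retries=3, delay=1.0, sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying on 5xx HTTPError with exponential backoff.

    Hypothetical helper: name and parameters are assumptions made for this sketch.
    """
    for attempt in range(1, retries + 1):
        try:
            return func(*args, **kwargs)
        except urllib.error.HTTPError as err:
            # 4xx errors will not go away on retry; also give up after the last attempt.
            if err.code < 500 or attempt == retries:
                raise
            # Wait delay, 2*delay, 4*delay, ... before the next attempt.
            sleep(delay * 2 ** (attempt - 1))
```

A call site would then wrap the failing call, e.g. `retry_on_http_error(self.commentapi.get_comments, **kwargs)`. Patching osc instead would cover every caller at once, at the cost of touching a shared library.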


Updated by jbaier_cz over 3 years ago

Yeah, the self.commentapi.get_comments(**kwargs) frame in the stack pretty much shows that the error came from python-osc; we probably hit a moment where OBS was unresponsive, being redeployed, or something similar. It would probably be nicer to add the retry to the osc lib itself, as we use it in many places (and other projects use it as well).

In the meantime, as a workaround, we can probably retrigger the pipeline itself automatically. But I am not sure this problem occurs often enough to even bother with the workaround.


Updated by jbaier_cz over 3 years ago

Status changed from New to In Progress
Assignee set to jbaier_cz


Updated by jbaier_cz over 3 years ago

Status changed from In Progress to Feedback

It turned out that I did not want to touch the code much: the retry would be needed in osc/core.py, which uses urllib.request.urlopen() and some other machinery around it. Instead, I decided to create a wrapper for the http_GET function that is exposed to comments.py. As I wanted to make minimal or no changes to comments.py, I kept the current XML parsing (which expects a file handle).

I created https://gitlab.suse.de/qa-maintenance/openQABot/-/merge_requests/91 to address the issues.
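
A wrapper along these lines might look like the following sketch. It is not the code from the merge request: `with_retry` and its parameters are hypothetical names, and it assumes the wrapped `http_GET`-like callable takes a URL and returns a response object with a `read()` method. The body is buffered into `io.BytesIO` so that XML parsing which expects a file handle keeps working unchanged.

```python
import io
import logging
import time
import urllib.error
import xml.etree.ElementTree as ET

log = logging.getLogger(__name__)


def with_retry(http_get, retries=3, delay=1.0, sleep=time.sleep):
    """Wrap an http_GET-like callable so 5xx errors are logged and retried.

    Hypothetical helper: the assumed contract is http_get(url, **kwargs) -> object with read().
    """
    def wrapper(url, **kwargs):
        for attempt in range(1, retries + 1):
            try:
                response = http_get(url, **kwargs)
                # Buffer the body so callers still get a file-like object for ET.parse().
                return io.BytesIO(response.read())
            except urllib.error.HTTPError as err:
                if err.code < 500 or attempt == retries:
                    raise
                log.warning("GET %s failed with %s (attempt %d/%d), retrying",
                            url, err.code, attempt, retries)
                sleep(delay)
    return wrapper
```

Existing callers that pass the result straight to `ET.parse(...)` would not need to change, since `io.BytesIO` behaves like an open binary file.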


Updated by jbaier_cz over 3 years ago

Status changed from Feedback to Resolved

The code is merged and running, so far without issues. In the future, we should see more detailed information about the problem, and the request should be retried automatically first.
