action #105169
closed
Pipeline of openQABot project fails with "urllib.error.HTTPError: HTTP Error 503: Service Unavailable" causing alert/notification
0%
Description
Observation
- See ticket description
- Restarting helped to let the job pass
- Pipeline: https://gitlab.suse.de/qa-maintenance/openQABot/-/pipelines/302125
- Job: https://gitlab.suse.de/qa-maintenance/openQABot/-/jobs/799531
Problem
It does not look like a retry was attempted. It is also not clear how the alert should be handled. Restarting the job let it pass, but it is unclear whether a manual restart is generally required here.
Acceptance criteria
- AC1: The alert is actionable (e.g. there's some information about it on https://progress.opensuse.org/projects/qa/wiki#Alert-handling) or suppressed if there's nothing to be done.
- AC2: The actual problem is avoided if possible (e.g. by adding a retry)
Updated by okurz almost 3 years ago
- Status changed from New to In Progress
- Assignee set to okurz
- Target version set to Ready
I will look into retrying
Updated by okurz almost 3 years ago
- Related to action #95822: qa-maintenance/openQABot failed to trigger aggregate tests with "urllib.error.HTTPError: HTTP Error 500: Internal Server Error" added
Updated by okurz almost 3 years ago
- Related to action #101274: openQABot pipeline failed with NewConnectionError added
Updated by okurz almost 3 years ago
- Related to action #104085: openQABot pipeline failed with terminating connection due to administrator command size:S added
Updated by okurz almost 3 years ago
- Status changed from In Progress to New
- Assignee deleted (okurz)
- Priority changed from Normal to Low
@jbaier_cz you implemented retry for internal request calls in bbf4e04. Looking at the call stack in https://gitlab.suse.de/qa-maintenance/openQABot/-/jobs/799531 it looks like we now have a problem within the osc Python library. WDYT, should we write a custom retry loop or patch the osc library to retry?
Updated by jbaier_cz almost 3 years ago
Yeah, the call self.commentapi.get_comments(**kwargs) in the stack pretty much shows that the error came from python-osc. We probably hit a moment where OBS was unresponsive, being redeployed, or similar. It would probably be nicer to add the retry to the osc lib itself, as we use it in many places (and in other projects as well).
In the meantime, as a workaround, we could automatically retrigger the pipeline itself. But I am not sure this problem occurs often enough to be worth the workaround.
Updated by jbaier_cz almost 3 years ago
- Status changed from New to In Progress
- Assignee set to jbaier_cz
Updated by jbaier_cz almost 3 years ago
- Status changed from In Progress to Feedback
It turned out that I do not want to touch the osc code much: the retry would be needed in osc/core.py, which uses urllib.request.urlopen() and some other machinery around it. Instead I decided to create a wrapper for the http_GET function that is exposed to comments.py. As I wanted to make minimal or no changes to comments.py, I kept the current XML parsing (which expects a file handle to read from).
I created https://gitlab.suse.de/qa-maintenance/openQABot/-/merge_requests/91 to address the issues.
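For readers without access to the merge request, the wrapper idea described above could look roughly like the following. This is only a minimal sketch, not the merged code: the names `with_retry` and `RETRY_STATUS`, the retry count, and the exponential backoff are all assumptions made for illustration. The only points taken from the comment are that a GET-style function is wrapped and that the caller still receives a file-like object for XML parsing.

```python
import io
import time
import urllib.error

# Assumption: only these server-side statuses are worth retrying.
RETRY_STATUS = {500, 502, 503, 504}


def with_retry(fetch, retries=3, backoff=1.0, sleep=time.sleep):
    """Wrap a GET-style callable (hypothetical stand-in for http_GET).

    `fetch` must return a file-like object and may raise
    urllib.error.HTTPError. The wrapper retries transient errors and
    returns the body as a BytesIO, so code that parses XML from a
    file handle keeps working unchanged.
    """

    def wrapper(url, *args, **kwargs):
        for attempt in range(1, retries + 1):
            try:
                response = fetch(url, *args, **kwargs)
                # Buffer the whole body so a late network error cannot
                # surface while the caller is parsing.
                return io.BytesIO(response.read())
            except urllib.error.HTTPError as err:
                # Give up on non-transient errors or on the last attempt.
                if err.code not in RETRY_STATUS or attempt == retries:
                    raise
                # Exponential backoff: backoff, 2*backoff, 4*backoff, ...
                sleep(backoff * 2 ** (attempt - 1))

    return wrapper
```

Under these assumptions, the object that provides the GET function to comments.py would be handed `with_retry(original_get)` instead of the original function. The `sleep` parameter exists only so the delay can be replaced in tests.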
Updated by jbaier_cz almost 3 years ago
- Status changed from Feedback to Resolved
Code is merged and running, so far without issues. In the future, we should see more detailed info about the problem and also it should auto-retry first.