action #105169

Pipeline of openQABot project fails with "urllib.error.HTTPError: HTTP Error 503: Service Unavailable" causing alert/notification

Added by mkittler 4 months ago. Updated 3 months ago.

Status: Resolved
Priority: Low
Assignee:
Target version:
Start date: 2022-01-20
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Observation

Problem

It does not look like a retry was attempted. It is also not clear how the alert should be handled. Restarting the job made it pass, but it is unclear whether a manual restart is generally required here.

Acceptance criteria


Related issues

Related to QA - action #95822: qa-maintenance/openQABot failed to trigger aggregate tests with "urllib.error.HTTPError: HTTP Error 500: Internal Server Error" (New, 2021-07-22)

Related to openQA Project - action #101274: openQABot pipeline failed with NewConnectionError (New, 2021-10-21)

Related to openQA Infrastructure - action #104085: openQABot pipeline failed with terminating connection due to administrator command size:S (Resolved, 2021-12-16)

History

#1 Updated by okurz 4 months ago

  • Status changed from New to In Progress
  • Assignee set to okurz
  • Target version set to Ready

I will look into retrying

#2 Updated by okurz 4 months ago

  • Related to action #95822: qa-maintenance/openQABot failed to trigger aggregate tests with "urllib.error.HTTPError: HTTP Error 500: Internal Server Error" added

#3 Updated by okurz 4 months ago

  • Related to action #101274: openQABot pipeline failed with NewConnectionError added

#4 Updated by okurz 4 months ago

  • Related to action #104085: openQABot pipeline failed with terminating connection due to administrator command size:S added

#5 Updated by okurz 4 months ago

  • Status changed from In Progress to New
  • Assignee deleted (okurz)
  • Priority changed from Normal to Low

jbaier_cz, you implemented retry for internal request calls in bbf4e04. Looking at the call stack in https://gitlab.suse.de/qa-maintenance/openQABot/-/jobs/799531, it looks like we now have a problem within the osc Python library. WDYT, should we do a custom retry loop or patch the osc library for retrying?

#6 Updated by jbaier_cz 4 months ago

Yeah, looking at the stack, self.commentapi.get_comments(**kwargs) pretty much shows that the error came from python-osc; we probably hit a moment when OBS was unresponsive, being redeployed, or something similar. It would probably be nicer to add the retry to the osc library itself, as we use it in many places (and in other projects as well).

In the meantime, as a workaround, we could automatically retrigger the pipeline itself. But I am not sure this problem occurs often enough to bother with the workaround.

#7 Updated by jbaier_cz 3 months ago

  • Status changed from New to In Progress
  • Assignee set to jbaier_cz

#8 Updated by jbaier_cz 3 months ago

  • Status changed from In Progress to Feedback

It turned out that I do not want to touch the code much: the retry would be needed in osc/core.py, which uses urllib.request.urlopen() and some other machinery around it. Instead, I decided to create a wrapper for the http_GET function that is exposed to comments.py. As I wanted to make minimal or no changes to comments.py, I kept the current XML parsing (which expects a file handle).

I created https://gitlab.suse.de/qa-maintenance/openQABot/-/merge_requests/91 to address the issues.
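
For readers without access to the merge request, the wrapper idea can be sketched roughly as follows. This is not the code from the merge request, only a minimal illustration under assumptions: the name retrying_http_get, its parameters, and the backoff policy are hypothetical, and the underlying GET function (for example osc's http_GET) is passed in as a callable so the sketch does not depend on osc itself. The response body is buffered into a file-like object so that XML parsing code expecting a file handle keeps working unchanged.

```python
import io
import logging
import time
import urllib.error

log = logging.getLogger(__name__)


def retrying_http_get(http_get, url, retries=3, backoff=1.0, sleep=time.sleep):
    """Call http_get(url), retrying on HTTP 5xx errors.

    Hypothetical helper: http_get is any callable that returns a readable
    response object and raises urllib.error.HTTPError on failure.
    Returns an io.BytesIO holding the response body.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        try:
            response = http_get(url)
            # Buffer the body so callers still get a file-like object.
            return io.BytesIO(response.read())
        except urllib.error.HTTPError as err:
            # Treat only server-side errors as transient; re-raise client
            # errors immediately and give up after the last attempt.
            if err.code < 500 or attempt == retries:
                raise
            delay = backoff * 2 ** (attempt - 1)
            log.warning(
                "GET %s failed with %s %s (attempt %d/%d), retrying in %.1fs",
                url, err.code, err.reason, attempt, retries, delay,
            )
            sleep(delay)
```

A caller would pass the real GET function and parse the returned handle as before, for example ElementTree.parse(retrying_http_get(http_GET, url)). The warning log line is what would give more detail about a transient failure such as the 503 from this ticket before the final attempt either succeeds or raises.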

#9 Updated by jbaier_cz 3 months ago

  • Status changed from Feedback to Resolved

The code is merged and running, so far without issues. From now on, we should see more detailed information about such a problem, and the request should be retried automatically first.
