action #104085

openQABot pipeline failed with terminating connection due to administrator command size:S

Added by cdywan 5 months ago. Updated 5 months ago.

Status: Resolved
Priority: High
Assignee:
Target version:
Start date: 2021-12-16
Due date:
% Done: 0%
Estimated time:

Description

openQABot pipeline failed like so:

ERROR: Unexpected response: {'errors': [{'message': 'terminating connection due to administrator command\nERROR:  server conn crashed?\nFATAL:  server conn crashed?\nserver closed the connection unexpectedly\n\tThis probably means the server terminated abnormally\n\tbefore or while processing the request.\n', 'locations': [{'line': 1, 'column': 2}], 'path': ['requests']}], 'data': {'requests': None}}
ERROR:root:Something bad happended during reading MR data from SMELT/IBS

Related issues

Related to openQA Infrastructure - action #105169: Pipeline of openQABot project fails with "urllib.error.HTTPError: HTTP Error 503: Service Unavailable" causing alert/notification (Resolved, 2022-01-20)

Copied to openQA Infrastructure - action #105603: openQABot pipeline failed: "ERROR:root:Something bad happended during reading MR data from SMELT/IBS: Expecting value: line 4 column 1 (char 3)" size:M (Resolved, 2021-12-16)

History

#1 Updated by cdywan 5 months ago

#2 Updated by cdywan 5 months ago

  • Status changed from New to Feedback

I guess it's fine now

#3 Updated by jbaier_cz 5 months ago

It is Thursday and the connection to SMELT failed. I am not sure it is even worth filing a ticket about this (before retriggering). We might consider moving the scheduled time outside the maintenance window.

#4 Updated by okurz 5 months ago

  • Status changed from Feedback to New
  • Target version set to Ready

Yes, I had already proposed in #102716 to completely disable all triggering of openQA jobs on OSD, but no agreement has been found so far. I think what we should be able to do here is retry for long enough to cover outages during which other machines are unreachable.

For example, in openqabot/update/mr.py, in _get_mr_requests where we call requests.get, we could use the internal retry feature of requests, similar to what we did in https://github.com/os-autoinst/openqa_review/pull/149/files

Also see https://findwork.dev/blog/advanced-usage-python-requests-timeouts-retries-hooks/
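
To illustrate the suggestion, here is a minimal sketch of a requests session with transport-level retries. It is not the code from openQABot or openqa_review; the helper name make_retrying_session and the chosen values (5 retries, backoff factor 2.0, status codes 502/503/504) are assumptions for illustration only.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_retrying_session(total=5, backoff_factor=2.0):
    """Return a requests.Session that retries failed requests with backoff.

    The name and default values are hypothetical, not taken from openQABot.
    """
    retry = Retry(
        total=total,                      # maximum number of retries per request
        backoff_factor=backoff_factor,    # exponentially growing sleep between retries
        status_forcelist=(502, 503, 504), # also retry on these HTTP status codes
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    # Apply the retry policy to both URL schemes.
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


# Hypothetical usage inside something like _get_mr_requests:
#   session = make_retrying_session()
#   response = session.get(url, timeout=30)
```

Note that this kind of retry only covers connection failures and the listed status codes. If the server answers successfully but the body carries an error payload, as in the description above, the caller would probably still need to check the response and retry at the application level.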

#5 Updated by okurz 5 months ago

  • Priority changed from Normal to High

#6 Updated by cdywan 5 months ago

  • Subject changed from openQABot pipeline failed with terminating connection due to administrator command to openQABot pipeline failed with terminating connection due to administrator command size:S
  • Description updated (diff)
  • Status changed from New to Workable

#7 Updated by jbaier_cz 5 months ago

  • Status changed from Workable to In Progress
  • Assignee set to jbaier_cz

#8 Updated by jbaier_cz 5 months ago

  • Status changed from In Progress to Feedback

Basic retry implemented with a simple wrapper call: https://gitlab.suse.de/qa-maintenance/openQABot/-/merge_requests/88
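
For readers without access to the merge request, a simple retry wrapper could look roughly like the sketch below. This is not the code from the merge request, whose contents are not shown here; the function name retry_call and all of its parameters are hypothetical.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_call(
    func: Callable[[], T],
    attempts: int = 5,
    delay: float = 1.0,
    backoff: float = 2.0,
    exceptions: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func() until it succeeds or the attempts are used up.

    Hypothetical helper: waits `delay` seconds after the first failure and
    multiplies the wait by `backoff` after each further failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    wait = delay
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: propagate the last error
            sleep(wait)
            wait *= backoff
    raise AssertionError("unreachable")
```

A caller would wrap the failing request, for example retry_call(lambda: fetch_requests(), attempts=10, delay=30), where fetch_requests stands in for whatever function talks to SMELT. Raising an exception on an unexpected response body lets the wrapper also cover errors that arrive inside an otherwise successful reply.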

#9 Updated by okurz 5 months ago

  • Description updated (diff)
  • Status changed from Feedback to Resolved

Merged. https://gitlab.suse.de/qa-maintenance/openQABot/-/jobs/750202 showed that at least nothing broke severely :) I don't think we need to keep this ticket open until we hit a problem with the network again. Looks good so far. Thanks!

#10 Updated by okurz 4 months ago

  • Related to action #105169: Pipeline of openQABot project fails with "urllib.error.HTTPError: HTTP Error 503: Service Unavailable" causing alert/notification added

#11 Updated by nicksinger 4 months ago

  • Copied to action #105603: openQABot pipeline failed: "ERROR:root:Something bad happended during reading MR data from SMELT/IBS: Expecting value: line 4 column 1 (char 3)" size:M added
