action #99396

closed

Incompletes with auto_review:"api failure: Failed to register .* 503":retry should be restarted automatically

Added by mkittler about 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2021-09-28
Due date: 2021-10-12
% Done: 0%
Estimated time:

Description

motivation

As we saw in #99153, many jobs can end up incomplete with this reason when the web UI is busy. It makes the most sense to restart those jobs automatically. (It is a bit unfortunate that communication with the web UI fails here, yet the problem can evidently still be propagated. Otherwise these jobs would show up as abandoned and would already be restarted.)

Steps to reproduce

Find jobs referencing this ticket with the help of the script
https://raw.githubusercontent.com/os-autoinst/scripts/master/openqa-query-for-job-label
by calling: openqa-query-for-job-label poo#99396

acceptance criteria

  • AC1: Incomplete jobs with a reason matching the regex from the ticket title (or possibly a less strict version) should be restarted automatically.
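
As a rough illustration of AC1, the following sketch checks job reasons against the regex from the ticket title. It is only a stand-in: openQA itself is written in Perl, and the helper name should_restart as well as the second sample reason are hypothetical and not taken from the openQA code base.

```python
import re

# Pattern copied verbatim from the ticket title. openQA would evaluate it
# in Perl; Python's `re` behaves the same for this simple expression.
RETRY_REASON = re.compile(r"api failure: Failed to register .* 503")


def should_restart(result: str, reason: str) -> bool:
    """Hypothetical helper: True if an incomplete job's reason matches."""
    return result == "incomplete" and RETRY_REASON.search(reason or "") is not None


if __name__ == "__main__":
    samples = [
        # Prefix of a reason reported for this ticket
        "api failure: Failed to register at openqa.suse.de - 503 response: ",
        # Hypothetical unrelated reason that must not match
        "backend died: some other problem",
    ]
    for reason in samples:
        print(should_restart("incomplete", reason), repr(reason))
```

A less strict variant, as AC1 allows, could drop the status code from the pattern; whether that is desirable depends on which other registration failures should be retried.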

suggestions

  • Try to configure this via settings in openqa.ini.
  • Otherwise consider extending the code.

Related issues 2 (0 open, 2 closed)

Related to openQA Project (public) - action #99153: [Alerting] Incomplete jobs (not restarted) of last 24h alert on 2021-9-24 | Resolved | mkittler | 2021-09-24

Copied to openQA Project (public) - action #99402: Incompletes with "backend died: Error connecting to VNC server.*: IO::Socket::INET: connect: Connection timed out":retry should be restarted automatically | Resolved | okurz | 2021-09-28 | 2021-10-22

Actions #1

Updated by mkittler about 3 years ago

  • Related to action #99153: [Alerting] Incomplete jobs (not restarted) of last 24h alert on 2021-9-24 added
Actions #2

Updated by okurz about 3 years ago

  • Category set to Regressions/Crashes
  • Status changed from New to In Progress
  • Assignee set to okurz
  • Target version set to Ready
Actions #3

Updated by okurz about 3 years ago

  • Due date set to 2021-10-12
  • Status changed from In Progress to Feedback
Actions #4

Updated by okurz about 3 years ago

  • Copied to action #99402: Incompletes with "backend died: Error connecting to VNC server.*: IO::Socket::INET: connect: Connection timed out":retry should be restarted automatically added
Actions #5

Updated by okurz about 3 years ago

merged

Actions #6

Updated by okurz about 3 years ago

  • Description updated (diff)
  • Status changed from Feedback to Resolved

The command openqa-query-for-job-label poo#99396 shows

7334911|2021-10-07 10:44:49|done|incomplete|cryptlvm|api failure: Failed to register at openqa.suse.de - 503 response: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Unavailable</title>
</head><body>
<h1>Service Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime …|openqaworker5
7334911|2021-10-07 10:44:49|done|incomplete|cryptlvm|api failure: Failed to register at openqa.suse.de - 503 response: …|openqaworker5
7334911|2021-10-07 10:44:49|done|incomplete|cryptlvm|api failure: Failed to register at openqa.suse.de - 503 response: …|openqaworker5
7330026|2021-10-07 10:44:42|done|incomplete|create_hdd_ha_textmode_maintenance|api failure: Failed to register at …|openqaworker6
7330026|2021-10-07 10:44:42|done|incomplete|create_hdd_ha_textmode_maintenance|api failure: Failed to register at …|openqaworker6
7330501|2021-10-07 10:44:29|done|incomplete|mru-install-minimal-with-addons|api failure: Failed to register at openqa.suse.de - …|openqaworker13
7330501|2021-10-07 10:44:29|done|incomplete|mru-install-minimal-with-addons|api failure: Failed to register at openqa.suse.de - …|openqaworker13
7334927|2021-10-07 10:44:13|done|incomplete|qam-regression-installation-SLED|api failure: Failed to register at openqa.suse.de …|openqaworker13
7334927|2021-10-07 10:44:13|done|incomplete|qam-regression-installation-SLED|api failure: Failed to register at openqa.suse.de …|openqaworker13
7334927|2021-10-07 10:44:13|done|incomplete|qam-regression-installation-SLED|api failure: Failed to register at openqa.suse.de …|openqaworker13
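
The output above lists the same job several times, and one reason spans multiple lines. A minimal sketch for reducing such output to unique job IDs follows. The column meaning (job ID first, test name fifth) is an assumption read off the output shown here, and unique_jobs is a hypothetical helper, not part of the openqa-query-for-job-label script.

```python
def unique_jobs(output: str) -> dict:
    """Map each job ID to its test name, keeping first-seen order.

    Assumes pipe-separated records that start with a numeric job ID.
    Continuation lines of a multi-line reason are skipped because they
    do not start with a job ID.
    """
    jobs = {}
    for line in output.splitlines():
        fields = line.split("|")
        if len(fields) >= 6 and fields[0].isdigit():
            jobs.setdefault(fields[0], fields[4])
    return jobs


if __name__ == "__main__":
    # A few rows copied from the output above
    sample = "\n".join([
        "7334911|2021-10-07 10:44:49|done|incomplete|cryptlvm|api failure: Failed to register at openqa.suse.de - 503 response: …|openqaworker5",
        "7330026|2021-10-07 10:44:42|done|incomplete|create_hdd_ha_textmode_maintenance|api failure: Failed to register at …|openqaworker6",
        "7330026|2021-10-07 10:44:42|done|incomplete|create_hdd_ha_textmode_maintenance|api failure: Failed to register at …|openqaworker6",
    ])
    for job_id, test in unique_jobs(sample).items():
        print(job_id, test)
```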

As in #99402, after checking the first job listed above I conclude that the automatic retriggering within openQA does its job. I will keep monitoring the other ticket; this one, I feel, can be resolved directly.
