action #99396
closed
Incompletes with auto_review:"api failure: Failed to register .* 503":retry should be restarted automatically
Description
motivation
As we saw in #99153, many of these jobs can occur when the web UI is busy, so it makes the most sense to restart them. (It is a bit unfortunate that the communication with the web UI does not work here and yet this problem can be propagated after all. Otherwise these jobs would show up as abandoned and would already be restarted.)
Steps to reproduce
Find jobs referencing this ticket with the help of https://raw.githubusercontent.com/os-autoinst/scripts/master/openqa-query-for-job-label, call openqa-query-for-job-label poo#99396
acceptance criteria
- AC1: Incomplete jobs with a reason matching the regex from the ticket title (or possibly a less strict version) should be restarted automatically.
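A minimal Python sketch of the check AC1 describes, for illustration only. The regex is copied from the ticket title; the function name `should_restart` and its parameters are hypothetical and not part of openQA, which may implement the matching differently.

```python
import re

# Regex taken verbatim from the ticket title.
RETRY_REASON = re.compile(r"api failure: Failed to register .* 503")


def should_restart(result: str, reason: str) -> bool:
    """Hypothetical helper: True for incomplete jobs whose reason matches."""
    # search() is used instead of match() so the pattern may occur anywhere
    # in the reason; a stricter variant could anchor it with "^".
    return result == "incomplete" and RETRY_REASON.search(reason) is not None
```

Since `.` does not match newlines by default, the pattern only matches when "503" appears on the same line as "Failed to register", which is the case for the reasons reported in this ticket.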
suggestions
- Try to configure this via settings in openqa.ini.
- Otherwise consider extending the code.
Updated by mkittler about 3 years ago
- Related to action #99153: [Alerting] Incomplete jobs (not restarted) of last 24h alert on 2021-9-24 added
Updated by okurz about 3 years ago
- Category set to Regressions/Crashes
- Status changed from New to In Progress
- Assignee set to okurz
- Target version set to Ready
Updated by okurz about 3 years ago
- Due date set to 2021-10-12
- Status changed from In Progress to Feedback
Updated by okurz about 3 years ago
- Copied to action #99402: Incompletes with "backend died: Error connecting to VNC server.*: IO::Socket::INET: connect: Connection timed out":retry should be restarted automatically added
Updated by okurz about 3 years ago
- Description updated
- Status changed from Feedback to Resolved
The command openqa-query-for-job-label poo#99396 shows
7334911|2021-10-07 10:44:49|done|incomplete|cryptlvm|api failure: Failed to register at openqa.suse.de - 503 response: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Unavailable</title>
</head><body>
<h1>Service Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime …|openqaworker5
7334911|2021-10-07 10:44:49|done|incomplete|cryptlvm|api failure: Failed to register at openqa.suse.de - 503 response: …|openqaworker5
7334911|2021-10-07 10:44:49|done|incomplete|cryptlvm|api failure: Failed to register at openqa.suse.de - 503 response: …|openqaworker5
7330026|2021-10-07 10:44:42|done|incomplete|create_hdd_ha_textmode_maintenance|api failure: Failed to register at …|openqaworker6
7330026|2021-10-07 10:44:42|done|incomplete|create_hdd_ha_textmode_maintenance|api failure: Failed to register at …|openqaworker6
7330501|2021-10-07 10:44:29|done|incomplete|mru-install-minimal-with-addons|api failure: Failed to register at openqa.suse.de - …|openqaworker13
7330501|2021-10-07 10:44:29|done|incomplete|mru-install-minimal-with-addons|api failure: Failed to register at openqa.suse.de - …|openqaworker13
7334927|2021-10-07 10:44:13|done|incomplete|qam-regression-installation-SLED|api failure: Failed to register at openqa.suse.de …|openqaworker13
7334927|2021-10-07 10:44:13|done|incomplete|qam-regression-installation-SLED|api failure: Failed to register at openqa.suse.de …|openqaworker13
7334927|2021-10-07 10:44:13|done|incomplete|qam-regression-installation-SLED|api failure: Failed to register at openqa.suse.de …|openqaworker13
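The output above lists several jobs more than once, and one reason spans multiple lines. A small Python sketch for reducing such output to distinct job IDs could look like this. It assumes that every record starts with a numeric job ID followed by "|", which is inferred from the output shown here and not from the script itself; the function name `unique_job_ids` is hypothetical.

```python
import re

# Assumption: a record starts with a numeric job ID followed by "|".
# Continuation lines of a multi-line reason do not match and are skipped.
RECORD_START = re.compile(r"^(\d+)\|")


def unique_job_ids(output: str) -> list[int]:
    """Return the distinct job IDs in order of first appearance."""
    seen: dict[int, None] = {}  # dicts preserve insertion order
    for line in output.splitlines():
        match = RECORD_START.match(line)
        if match:
            seen.setdefault(int(match.group(1)), None)
    return list(seen)
```

Applied to the output above, the ten records reduce to four distinct jobs: 7334911, 7330026, 7330501 and 7334927.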
As in #99402, after checking the first job above I conclude that the automatic retriggering within openQA does its job. I will monitor the other ticket. Here I feel we can resolve directly.