action #157018
closed [sporadic] Build failed in Jenkins: submit-openQA-TW-to-oS_Fctry - Error 503: Service Unavailable size:S
Added by tinita 9 months ago. Updated 9 months ago.
Description
Observation
Date: Sat, 9 Mar 2024 03:49:48 +0100 (CET)
See <http://jenkins.qa.suse.de/job/submit-openQA-TW-to-oS_Fctry/1001/display/redirect>
Changes:
------------------------------------------
[...truncated 4.20 MiB...]
<result project="devel:openQA:tested" repository="openSUSE_Factory" arch="x86_64" code="blocked" state="blocked">
<status package="openQA" code="blocked">
+ echo 'Waiting while openQA is in progress'
Waiting while openQA is in progress
...
4.6.1709822711.90519fe6
4.6.1709822711.90519fe6
4.6.1709822711.90519fe6
4.6.1709822711.90519fe6' openSUSE:Factory
Server returned an error: HTTP Error 503: Service Unavailable
Acceptance criteria
- AC1: Short unavailabilities of OBS are covered with retry
Suggestions
- Use https://build.opensuse.org/package/show/openSUSE:Factory/retry in the corresponding script from github.com/os-autoinst/scripts/
Updated by tinita 9 months ago
- Category set to Regressions/Crashes
Updated by okurz 9 months ago
- Subject changed from Build failed in Jenkins: submit-openQA-TW-to-oS_Fctry - Error 503: Service Unavailable to [sporadic] Build failed in Jenkins: submit-openQA-TW-to-oS_Fctry - Error 503: Service Unavailable size:S
- Description updated (diff)
- Status changed from New to Workable
Updated by livdywan 9 months ago
- Status changed from Workable to In Progress
- Assignee set to livdywan
Suggestions
- Use https://build.opensuse.org/package/show/openSUSE:Factory/retry in the corresponding script from github.com/os-autoinst/scripts/
I assume os-autoinst-obs-auto-submit:38 is the relevant code. Let's see if I can propose a trivial fix.
Updated by livdywan 9 months ago
https://github.com/os-autoinst/scripts/pull/300. Correction: the relevant code is the osc call.
Updated by openqa_review 9 months ago
- Due date set to 2024-03-29
Setting due date based on mean cycle time of SUSE QE Tools
Updated by livdywan 9 months ago
- Status changed from In Progress to Feedback
https://github.com/os-autoinst/scripts/pull/300 is merged. Let's see if it works fine. I would probably resolve this soon, since we likely won't see another outage.
Updated by tinita 9 months ago
Had to revert it: https://github.com/os-autoinst/scripts/pull/301
Updated by jbaier_cz 9 months ago
- Status changed from Feedback to Workable
- Priority changed from Normal to High
This still needs an update though, see the error:
+ retry -e osc co --server-side-source-service-files devel:openQA/openQA
retry: unrecognized option '--server-side-source-service-files'
usage: /usr/bin/retry [options] [cmd...]
options:
-h,--help Show this help
-r,--retries=RETRIES How many retries to do on command failure after
the initial try. Defaults to 3.
-s,--sleep=SLEEP How many seconds to sleep between retries.
Defaults to 3 seconds.
-e,--exponential[=FACTOR] Enable simple exponential back-off algorithm.
Disabled by default, factor defaults to 2
(binary exponential back-off).
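The error above is consistent with getopt-style parsing, where every argument that looks like an option is claimed by the wrapper unless a `--` separator ends option parsing. The sketch below illustrates that behavior with a hypothetical stand-in function, `retry_sketch`. It is not the real /usr/bin/retry, it only accepts the separated forms `-r N`, `-s N` and `-e`, and its exact parsing rules are an assumption made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for /usr/bin/retry, written only to illustrate
# why "--" is needed. It is NOT the real implementation.
retry_sketch() {
    local retries=3 sleep_s=3 factor=1 attempt=0
    local -a cmd=()
    while [ $# -gt 0 ]; do
        case "$1" in
            -r) retries=$2; shift 2 ;;
            -s) sleep_s=$2; shift 2 ;;
            -e) factor=2; shift ;;
            --) shift; cmd+=("$@"); break ;;   # the rest belongs to the command
            -*) echo "retry_sketch: unrecognized option '$1'" >&2; return 2 ;;
            *)  cmd+=("$1"); shift ;;          # keep scanning for options
        esac
    done
    [ "${#cmd[@]}" -gt 0 ] || return 2
    until "${cmd[@]}"; do
        [ "$attempt" -lt "$retries" ] || return 1
        attempt=$((attempt + 1))
        echo "Retrying up to $((retries - attempt + 1)) more times after sleeping ${sleep_s}s" >&2
        sleep "$sleep_s"
        sleep_s=$((sleep_s * factor))          # factor 2 doubles the sleep each time
    done
    return 0
}
```

With this stand-in, `retry_sketch -s 0 -- printf '%s\n' --flag` passes `--flag` through to printf, while the same call without `--` is rejected as an unrecognized option.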
Updated by okurz 9 months ago
- Status changed from Workable to In Progress
A new PR was created, merged, and verified in Jenkins. There is still one problematic entry:
+++ retry -e -- osc cat openSUSE:Factory/openQA/openQA.changes
+++ grep 'Update to version'
+++ head -n1
Retrying up to 3 more times after sleeping 3s …
Retrying up to 2 more times after sleeping 6s …
Retrying up to 1 more times after sleeping 12s …
There shouldn't have been a retry here. Apparently there is a SIGPIPE caused by head exiting after the first line. Try to reproduce with:
retry -r0 -e -- osc cat openSUSE:Factory/openQA/openQA.changes | grep 'Update to version' | head -n1; echo "${PIPESTATUS[@]}"
- Update to version 4.6.1710762624.7d0dd225:
1 141 0
The grep 'Update to version' | head -n1 can actually be simplified to grep -m1 'Update to version', but that does not yet fix the original problem:
retry -r0 -e -- osc cat openSUSE:Factory/openQA/openQA.changes | grep -m1 'Update to version'; echo "${PIPESTATUS[@]}"
- Update to version 4.6.1710762624.7d0dd225:
1 0
This gets rid of the SIGPIPE on grep but keeps the failure of osc cat. Then I found one other possibility, using process substitution:
grep -m1 'Update to version' <(retry -r0 -e -- osc cat openSUSE:Factory/openQA/openQA.changes); echo "${PIPESTATUS[@]}"
- Update to version 4.6.1710762624.7d0dd225:
0
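To compare the pipeline and process-substitution variants without OBS access, here is a small self-contained sketch. The `produce` function is a hypothetical stand-in for the `osc cat` call and simply prints many lines. The sketch shows that `PIPESTATUS` holds one entry per pipeline stage, and that with process substitution only grep's own status is reported, because the producer runs outside the pipeline.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in producer: prints many lines, like a long .changes file.
produce() { seq 1 200000; }

# Pipeline: one PIPESTATUS entry per stage. The producer usually dies from
# SIGPIPE (status 141 = 128 + 13) once the consumer exits after its first match.
produce | grep -m1 '^5' >/dev/null
pipeline_status=("${PIPESTATUS[@]}")

# Process substitution: grep is the only command in the foreground,
# so the producer's exit status is not visible in PIPESTATUS at all.
grep -m1 '^5' <(produce) >/dev/null
procsub_status=("${PIPESTATUS[@]}")

echo "pipeline: ${pipeline_status[*]}"
echo "process substitution: ${procsub_status[*]}"
```

A tool that handles the broken pipe itself may report a different non-zero status instead of 141, which would match the 1 seen for retry in the output above.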
Updated by livdywan 9 months ago
bash: line 158: prefix: unbound variable
Apparently this broke elsewhere now.
Updated by okurz 9 months ago
- Assignee changed from livdywan to okurz
I wonder why http://jenkins.qa.suse.de/job/submit-openQA-TW-to-oS_Fctry/1010/console still mentions multiple "Retrying" lines; looking into that. Also, the "Retrying up to 2 more times after sleeping 6s …" line is doubled. There seems to be a retry process still running in the background, as lines like "Retrying up to 1 more times after sleeping 12s …" are intermixed with other content. OK, the process substitution seems to be a bad idea.
Updated by okurz 9 months ago
This reproduces the problem
0 $ grep -m1 'Update to version' <(retry -s0 -e -- osc cat openSUSE:Factory/openQA/openQA.changes)
- Update to version 4.6.1710845353.23e79984:
0 $ Retrying up to 3 more times after sleeping 0s …
Retrying up to 2 more times after sleeping 0s …
Retrying up to 1 more times after sleeping 0s …
So one can see grep returning fine, but retry output still piles up in the background afterwards. Before, I used retry -r0, so no retries would have been executed. With -s0 the retries are still executed, but with no sleep time in between.
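One way to avoid both the early-exit failure and the leftover background process is to let the producer finish, capture its output, and filter afterwards. The following is only a sketch of that idea and not necessarily what the merged fix does. `fetch_changes` is a hypothetical stand-in for the retried `osc cat` call, and the changelog text it prints is made up for illustration, reusing version strings from this ticket.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the retried "osc cat" call; prints a fake changelog.
fetch_changes() {
    printf '%s\n' \
        '- Update to version 4.6.1710845353.23e79984:' \
        '  * placeholder entry' \
        '- Update to version 4.6.1710762624.7d0dd225:'
}

# Capture first: the producer runs to completion, so an early-exiting consumer
# cannot make it fail, and its real exit status can be checked.
if ! changes=$(fetch_changes); then
    echo "fetching changes failed" >&2
    exit 1
fi

# Filter afterwards on the captured text; no pipe to the producer is involved.
first_update=$(grep -m1 'Update to version' <<<"$changes")
echo "$first_update"
```

The trade-off is that the whole output is held in a shell variable, which is fine for a changes file but would be a poor fit for very large outputs.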
Updated by okurz 9 months ago · Edited
- Due date deleted (2024-03-29)
- Status changed from Feedback to In Progress
Merged. http://jenkins.qa.suse.de/job/submit-openQA-TW-to-oS_Fctry/1011/console looks better now, but the job still failed in the end.
Updated by openqa_review 9 months ago
- Due date set to 2024-04-04
Setting due date based on mean cycle time of SUSE QE Tools
Updated by okurz 9 months ago
https://github.com/os-autoinst/scripts/pull/312 merged. Forced execution of http://jenkins.qa.suse.de/job/submit-openQA-TW-to-oS_Fctry/1012/console now
Updated by okurz 9 months ago
- Due date deleted (2024-04-04)
- Status changed from In Progress to Resolved
http://jenkins.qa.suse.de/job/submit-openQA-TW-to-oS_Fctry/1012/console succeeded now, with no mentions of "Retrying".