action #163781
closed
Jobs randomly fail with unspecified "api failure", there should be more details in the error message size:S
Added by MDoucha 8 months ago.
Updated 6 months ago.
Category: Feature requests
- Is duplicate of action #162038: No HTTP Response on OSD on 10-06-2024 - auto_review:".*timestamp mismatch - check whether clocks on the local host and the web UI host are in sync":retry size:S added
- Category set to Feature requests
- Target version set to Tools - Next
- Subject changed from Jobs randomly fail with unspecified "api failure" to Jobs randomly fail with unspecified "api failure", there should be more details in the error message
- Status changed from New to Resolved
I validated that the openQA changes are deployed and applied my config change manually (including restarting services) for now, until our pipelines work again. So far we have not seen the new error message, which is expected and good. We discussed that this should be sufficient for now; other alerts (e.g. number of new incomplete jobs) should notify us if the situation gets worse.
- Assignee set to nicksinger
- Target version changed from Tools - Next to Ready
- Is duplicate of deleted (action #162038: No HTTP Response on OSD on 10-06-2024 - auto_review:".*timestamp mismatch - check whether clocks on the local host and the web UI host are in sync":retry size:S)
- Related to action #162038: No HTTP Response on OSD on 10-06-2024 - auto_review:".*timestamp mismatch - check whether clocks on the local host and the web UI host are in sync":retry size:S added
- Status changed from Resolved to New
nicksinger wrote in #note-4:
I validated that the openQA changes are deployed and applied my config change manually (including restarting services) for now, until our pipelines work again. So far we have not seen the new error message, which is expected and good. We discussed that this should be sufficient for now; other alerts (e.g. number of new incomplete jobs) should notify us if the situation gets worse.
It seems progress/Redmine took my last comment from the other ticket (https://progress.opensuse.org/issues/162038) and applied it here as well. That comment does not change anything for this ticket, so I am reopening it.
- Assignee deleted (nicksinger)
- Related to action #164418: Distinguish "timestamp mismatch" from cases where webUI is slow or where clocks are really differing added
- Target version changed from Ready to Tools - Next
- Subject changed from Jobs randomly fail with unspecified "api failure", there should be more details in the error message to Jobs randomly fail with unspecified "api failure", there should be more details in the error message size:S
- Description updated (diff)
- Status changed from New to Workable
- Target version changed from Tools - Next to Ready
- Priority changed from Normal to Low
- Status changed from Workable to In Progress
- Assignee set to mkittler
- Status changed from In Progress to Feedback
- Status changed from Feedback to Resolved
With the PR merged, I don't think we'll see jobs failing with just "api failure" anymore. If I missed any cases, we can reopen the ticket. I cannot check the specific jobs mentioned in the ticket description because they now return 404.