action #96010
closed
[qem] test fails in hawk_gui acquiring a lock as the support server ended prematurely after a '503 response: Service Unavailable; URL was http://openqa.suse.de/api/v1/mm/children'
Added by martinsmac over 3 years ago.
Updated about 3 years ago.
Category: Regressions/Crashes
Description
Observation
openQA test in scenario sle-15-SP1-Server-DVD-HA-Incidents-x86_64-qam_ha_hawk_client@64bit fails in hawk_gui
Test suite description
The base test suite is used for job templates defined in YAML documents. It has no settings of its own.
Reproducible
Fails since (at least) Build :20208:fence-agents
Expected result
Last good: :20487:novnc (or more recent)
Further details
Always latest result in this scenario: latest
The test fails at several different steps; the most common one is the step named in this ticket. More research is needed to determine what the problem is.
- Project changed from openQA Tests (public) to openQA Project (public)
- Subject changed from [qem] test fails in hawk_gui to [qem] test fails in hawk_gui acquiring a lock as the support server ended prematurely after a '503 response: Service Unavailable; URL was http://openqa.suse.de/api/v1/mm/children'
- Category changed from Bugs in existing tests to Regressions/Crashes
- Status changed from New to In Progress
- Assignee set to okurz
- Priority changed from Normal to High
- Target version set to Ready
Not seen before, but here is what I see:
[2021-07-26T12:47:42.719 CEST] [debug] Waiting for 3 jobs to finish
[2021-07-26T12:47:43.734 CEST] [debug] get_children: 503 response: Service Unavailable; URL was http://openqa.suse.de/api/v1/mm/children
[2021-07-26T12:47:43.734 CEST] [debug] Waiting for 0 jobs to finish
So it looks like the worker failed to reach osd and treated that failure as not needing to wait anymore. I will take a deeper look.
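To illustrate the suspected failure mode, here is a minimal Python sketch of a wait loop that keeps "the query failed" separate from "no jobs left". It is not the actual openQA code: `count_unfinished`, `wait_for_children`, the job-state names and the convention that a failed query returns `None` are all assumptions made for this example.

```python
from typing import Callable, Dict, Optional

# Hypothetical set of states that count as "finished" for this sketch.
DONE_STATES = {"done", "cancelled"}

# A children query returns {job_id: state}, or None if the request failed
# (for example on a 503 response). This convention is an assumption.
Children = Optional[Dict[int, str]]


def count_unfinished(children: Children) -> Optional[int]:
    """Return the number of unfinished jobs, or None if the query failed."""
    if children is None:
        # A failed query means "unknown", not "zero jobs left".
        return None
    return sum(1 for state in children.values() if state not in DONE_STATES)


def wait_for_children(
    get_children: Callable[[], Children],
    sleep: Callable[[float], None],
    max_polls: int = 100,
) -> bool:
    """Poll until all children are finished; return False if polls run out."""
    for _ in range(max_polls):
        remaining = count_unfinished(get_children())
        if remaining == 0:
            return True
        # Either jobs are still running or the query failed: poll again.
        sleep(1.0)
    return False
```

With this shape, a transient 503 in the middle of the polling only costs one extra poll instead of ending the wait early.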
- Due date set to 2021-08-09
- Status changed from In Progress to Feedback
So in the above PR I added a fix, better logging output, more test coverage. What could we do about monitoring or our processes?
- Status changed from Feedback to Resolved
no problems identified after deployment. Suggested follow-up improvement: #96191
- Status changed from Resolved to Feedback
Looks the same indeed. Tho I'm surprised to see this after 2 months - did this go unnoticed or did it only start happening again? 🤔️
cdywan wrote:
Looks the same indeed. Tho I'm surprised to see this after 2 months - did this go unnoticed or did it only start happening again? 🤔️
I would say it started happening again; I didn't see this kind of failure during the last 2 months.
Today there was only one failure on aggregates: https://openqa.suse.de/tests/7170466
- Copied to action #98940: mmapi calls can still fail despite retries added
- Due date deleted (2021-08-09)
- Status changed from Feedback to Resolved
I will try with longer and more retries in #98940
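As a rough Python sketch of what "longer and more retries" could look like: the helper below retries a request with an exponentially growing delay. The name `call_with_retries`, the default retry count and the delays are illustrative assumptions, not the values or code used in #98940.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(
    request: Callable[[], Optional[T]],
    retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request() until it returns a non-None result.

    request() is assumed to return None on a transient failure such as a
    503 response. The delay doubles after every failed attempt.
    """
    delay = base_delay
    for attempt in range(1, retries + 1):
        result = request()
        if result is not None:
            return result
        if attempt < retries:
            sleep(delay)
            delay *= 2
    # Fail loudly instead of letting the caller mistake this for "no data".
    raise RuntimeError(f"request failed after {retries} attempts")
```

Raising after the last attempt keeps a persistent outage visible to the caller rather than silently ending the wait.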