action #109028
[openqa][worker][sut] Very severe stability and connectivity issues of openqa workers and suts
Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version:
Start date: 2022-03-28
Due date:
% Done: 0%
Estimated time:
Description
Observation
The openQA environment for virtualization test runs has recently become very unstable, to the point of being intolerable. Under current conditions, virtualization functional test runs cannot finish completely or on time whenever a testing task with time constraints comes in.
- The latest daily build, Build116.4, has still not finished its acceptance test run despite many reruns.
- Every unfinished test suite on the Build116.4 acceptance test run page has pages of rerun history, for example, this one.
- I believe there is a general environment issue, but from observation openqaworker-2:18 and openqaworker-2:20 are the two workers affected the most. They seem unable to finish almost any test run assigned to them, even with reruns.
Steps to reproduce
- Observe a newly triggered test run for a new daily build
- Manually rerun all failed test runs repeatedly
Problem
- It seems that only a general environment issue could have such a widespread and severe impact on openQA test runs
- I am also aware of poo#108845, but I am not sure whether it is related.
Suggestion
- Check for environment issues in the openQA network, including SUT machine status, the server room situation, infrastructure connectivity and stability, network glitches, etc.
- Check the status of the openQA workers and the machines they run on, especially openqaworker-2:18 and openqaworker-2:20. They may need maintenance.
Workaround
None known. Even reruns do not improve the current situation.