action #180989
open[openQA][worker][ipmi] It takes an extremely long time to finish the job when connected to an ipmi worker size:S
0%
Description
Observation
Recently I have run into the same issue many times: it takes an extremely long time to release certain workers, for example grenache-1:14.
I scheduled two test runs on the same worker to verify another issue, but had to wait a long time between the first job finishing and the second job starting.
Worker grenache-1:14 took more than 6.5 hours to finish its work on the first job, although the job's test modules took less than 4 hours in total. This means the worker needed at least another 2.5 hours to be released before it could take a new job. See this example job.
I know a worker may need to do some post-test-run work, such as uploading logs and cleaning up, before taking a new job. But 2.5 hours is still too long in this case, so something seems to be wrong. I have not noticed the same issue with other workers so far.
Steps to reproduce
- Schedule two jobs on grenache-1:14
- Wait for the second job to be picked up after the first one finishes.
Impact
It might not have a big impact on large-scale test runs overall, but it does slow down my verification work.
Problem
It looks like the worker lingers on some unnoticed work after it has finished a job.
Suggestions
- Check the worker process and logs on the worker machine
- Do maintenance or cleanup if necessary
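When checking the worker log, one way to narrow down where the time goes is to look for long silent periods between consecutive log lines. The following is only a minimal sketch: it assumes each log line starts with a bracketed ISO 8601 timestamp, the function name `find_gaps` and the 10-minute threshold are made up for illustration, and the sample lines are invented, not taken from grenache-1:14.

```python
import re
from datetime import datetime, timedelta

# Assumed line shape: "[2025-04-08T10:00:00+00:00] [debug] message"
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2}T[^\]]+)\]")


def find_gaps(lines, threshold=timedelta(minutes=10)):
    """Return (gap, previous_line, next_line) for each silence longer than threshold."""
    gaps = []
    prev_time, prev_line = None, None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip lines that do not start with a timestamp
        # Older Python versions do not accept a trailing "Z", so normalise it.
        stamp = datetime.fromisoformat(match.group(1).replace("Z", "+00:00"))
        if prev_time is not None and stamp - prev_time > threshold:
            gaps.append((stamp - prev_time, prev_line, line))
        prev_time, prev_line = stamp, line
    return gaps


# Invented sample lines, only to show the shape of the result.
sample = [
    "[2025-04-08T10:00:00+00:00] [info] last test module finished",
    "[2025-04-08T10:02:00+00:00] [debug] uploading logs",
    "[2025-04-08T12:32:00+00:00] [info] job done",
]

for gap, before, after in find_gaps(sample):
    print(gap, "|", before, "->", after)
```

The lines on either side of a reported gap show which step the worker was in while it appeared to be idle. If the real log uses a different timestamp format, the regular expression needs to be adjusted.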
Workaround
n/a
Updated by livdywan about 12 hours ago
- Subject changed from [openQA][worker][ipmi] It takes extremely long time to release certain ipmi worker to [openQA][worker][ipmi] It takes an extremely long time to finish the job when connected to an ipmi worker size:S
- Status changed from New to Workable