action #180989

open

[openQA][worker][ipmi] It takes an extremely long time to finish the job when connected to an ipmi worker size:S

Added by waynechen55 8 days ago. Updated about 12 hours ago.

Status:
Workable
Priority:
High
Assignee:
-
Category:
Regressions/Crashes
Start date:
2025-04-15
Due date:
% Done:

0%

Estimated time:

Description

Observation

Recently I have hit the same issue many times: it takes an extremely long time for a certain worker to be released, for example grenache-1:14.

Initially I scheduled two test runs on the same worker to verify another issue, but I had to wait a long time between the first job finishing and the second job starting.

Worker grenache-1:14 took more than 6.5 hours to finish its work on the first job, although the job's test modules took less than 4 hours in total. This means the worker needed about another 2.5 hours before it was released to take a new job. See this example job.
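The arithmetic behind the 2.5-hour figure can be sketched as follows. This is only an illustration using the rounded durations from the report; the function name `release_overhead` is made up for this example and is not part of openQA. Because the worker was busy for more than 6.5 hours and the modules ran for less than 4 hours, the real overhead is at least the value computed here.

```python
from datetime import timedelta


def release_overhead(total_busy: timedelta, module_runtime: timedelta) -> timedelta:
    """Return how long the worker stayed busy beyond the test modules' runtime."""
    if module_runtime > total_busy:
        raise ValueError("module runtime cannot exceed total busy time")
    return total_busy - module_runtime


# Rounded figures from the report (assumed, not exact timestamps):
# more than 6.5 h busy in total, less than 4 h spent in test modules.
total_busy = timedelta(hours=6.5)
module_runtime = timedelta(hours=4)

overhead = release_overhead(total_busy, module_runtime)
# Fraction of the worker's busy time that was spent after the modules finished.
overhead_share = overhead / total_busy
```

With exact start and end timestamps from the job page, the same subtraction would give a precise figure instead of a lower bound.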

I know a worker may need to do some post-test-run work, such as uploading logs and cleaning up, before taking a new job. But 2.5 hours is still too long in this case. Something seems to be wrong with it. I have not noticed the same issue with other workers so far.

Steps to reproduce

  • Schedule two jobs on grenache-1:14
  • Wait for the second job to be picked up after the first one finishes.

Impact

It might not have a big impact on large-scale test runs overall, but it does slow down my verification progress.

Problem

It looks like the worker lingers on some unnoticed work after finishing a job.

Suggestions

  • Check worker process/log on worker machine
  • Do maintenance or cleanup if necessary
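One way to approach the log check is to look for the longest silence between consecutive log entries, since that is where the worker most likely lingered. The sketch below is a minimal example under stated assumptions: it assumes each log line starts with an ISO 8601 timestamp followed by a space (for instance as produced by `journalctl -o short-iso`), and the function name `largest_gap` and the sample lines are hypothetical, not real worker output.

```python
from datetime import datetime, timedelta


def largest_gap(lines):
    """Return (gap, line_before, line_after) for the longest pause between
    consecutive timestamped lines. Lines without a leading timestamp are skipped."""
    best = (timedelta(0), None, None)
    prev_ts, prev_line = None, None
    for line in lines:
        stamp = line.split(" ", 1)[0]
        try:
            # Assumed format: 2025-04-15T10:00:00+0200 (ISO 8601 with UTC offset)
            ts = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S%z")
        except ValueError:
            continue  # continuation line or unexpected format
        if prev_ts is not None and ts - prev_ts > best[0]:
            best = (ts - prev_ts, prev_line, line)
        prev_ts, prev_line = ts, line
    return best


# Hypothetical sample lines, invented for illustration only.
sample = [
    "2025-04-15T10:00:00+0200 host worker[1]: step A",
    "2025-04-15T10:05:00+0200 host worker[1]: step B",
    "    continuation line without a timestamp",
    "2025-04-15T12:35:00+0200 host worker[1]: step C",
]
gap, before, after = largest_gap(sample)
```

The two lines surrounding the largest gap show what the worker did last before going quiet and what it did first afterwards, which narrows down where to look next.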

Workaround

n/a

Actions #1

Updated by waynechen55 8 days ago

  • Description updated (diff)

Actions #2

Updated by livdywan 7 days ago

  • Tags set to infra, reactive work
  • Category set to Regressions/Crashes
  • Priority changed from Normal to High
  • Target version set to Ready

Probably good to check soon as this would slow all jobs by the sounds of it.

Actions #3

Updated by livdywan about 12 hours ago

  • Subject changed from [openQA][worker][ipmi] It takes extremely long time to release certain ipmi worker to [openQA][worker][ipmi] It takes an extremely long time to finish the job when connected to an ipmi worker size:S
  • Status changed from New to Workable