action #180989
[openQA][worker][ipmi] ipmi jobs are stuck in tear-down and run into MAX_JOB_TIME (was: It takes an extremely long time to finish the job when connected to an ipmi worker) size:S
Status: closed
Description
Observation
Recently I have encountered the same issue many times, namely that it takes an extremely long time to release a certain worker, for example grenache-1:14.
Initially I scheduled two test runs on the same worker to verify another issue, but I had to wait quite a long time after the first job finished before the second job started running.
Worker grenache-1:14 took more than 6.5 hours to finish its work on the first job, whose test modules actually took less than 4 hours. This means it took another 2.5 hours for worker grenache-1:14 to be released before it could take a new job. See this example job.
I know a worker may need to do some post-test-run work, upload logs and clean itself up before taking a new job, but 2.5 hours is still far too long for this case. It seems that something is wrong with it. I personally have not noticed the same issue with other workers so far.
Steps to reproduce
- Schedule two jobs on grenache-1:14
- Wait for the second job to be picked up after the first one finishes.
Impact
It might not have a big impact on large-scale test runs overall, but it does affect my verification progress.
Problem
It looks like the worker lingers on some unnoticed work after finishing its work on a job.
Suggestions
- Check the worker process/log on the worker machine (ps auxf on grenache-1 or sudo systemctl list-units 'openqa-worker*'); see the example commands after this list
- Do maintenance or cleanup if necessary
- Find the relevant code in os-autoinst, e.g. grep for the error messages
- DONE See if the problem reproduces when setting MAX_JOB_TIME=1
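A rough sketch of how these checks could look on grenache-1 (the instance number 14 and the os-autoinst install path are assumptions based on this ticket):
# overview of worker units and the state of the affected instance
sudo systemctl list-units 'openqa-worker*'
sudo systemctl status openqa-worker-auto-restart@14.service
# look for lingering isotovideo/encoder child processes below the worker
ps auxf | grep -A 5 -i 'isotovideo\|openqa-worker'
# recent worker log
sudo journalctl -u openqa-worker-auto-restart@14.service --since "2 hours ago"
# find the os-autoinst code that produces the orphan-collection message (path assumed)
grep -rn "collected unknown process" /usr/lib/os-autoinst/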
Workaround
n/a
Updated by livdywan about 1 month ago
- Tags set to infra, reactive work
- Category set to Regressions/Crashes
- Priority changed from Normal to High
- Target version set to Ready
Probably good to check this soon, as this would slow down all jobs by the sounds of it.
Updated by robert.richardson 26 days ago
- Status changed from Workable to In Progress
- Assignee set to robert.richardson
Updated by openqa_review 25 days ago
- Due date set to 2025-05-10
Setting due date based on mean cycle time of SUSE QE Tools
Updated by robert.richardson 22 days ago · Edited
As the job mentioned in the description was already deleted, I had a look at another affected job's autoinst.log; there seems to be an orphan process causing the issue. The rerun was also affected (cloned using openqa-clone-job --within-instance "https://openqa.suse.de/tests/17440104" --skip-chained-deps WORKER_CLASS=amd-zen3-gpu-sut1 _GROUP=0 {TEST,BUILD}+=-poo180689):
[2025-04-24T14:44:17.599Z] [debug] [pid:355418] IPMI: Chassis Power is on
[2025-04-24T14:44:17.602924Z] [debug] [pid:355398] backend shutdown state:
[2025-04-24T14:44:18.104431Z] [debug] [pid:355418] IPMI:
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":36913"
after 28805 requests (28648 known processed) with 0 events remaining.
[2025-04-24T14:44:18.106254Z] [debug] [pid:355418] Passing remaining frames to the video encoder
xterm: fatal IO error 11 (Resource temporarily unavailable) or KillClient on X server ":36913"
[2025-04-24T14:44:18.108519Z] [info] [pid:355398] ::: backend::driver::_collect_orphan: Driver backend collected unknown process with pid 357109 and exit status: 1
[2025-04-24T14:44:22.154614Z] [info] [pid:355398] ::: backend::driver::_collect_orphan: Driver backend collected unknown process with pid 357115 and exit status: 0
[2025-04-24T18:24:14.586636Z] [debug] [pid:355398] isotovideo received signal TERM
[2025-04-24T18:24:14.586636Z] [debug] [pid:355418] backend got TERM
Any suggestions? Maybe I can modify _collect_orphan to output the cmd associated with the stuck process?
Also, this is the worker's systemd output (sudo journalctl -u openqa-worker-auto-restart@14.service --since "2025-04-24 10:23"):
Apr 24 14:44:26 grenache-1 worker[415547]: [debug] [pid:415547] Uploading artefact validate_system_health-12.txt
Apr 24 14:44:26 grenache-1 worker[442143]: [debug] [pid:442143] Upload concluded (up to validate_system_health)
Apr 24 14:44:36 grenache-1 worker[442143]: [debug] [pid:442143] REST-API call: POST "http://openqa.suse.de/api/v1/jobs/17440104/status"
Apr 24 14:44:36 grenache-1 worker[442143]: [debug] [pid:442143] Upload concluded (no current module)
...
Apr 24 18:24:11 grenache-1 worker[442143]: [debug] [pid:442143] REST-API call: POST "http://openqa.suse.de/api/v1/jobs/17440104/status"
Apr 24 18:24:11 grenache-1 worker[442143]: [debug] [pid:442143] Upload concluded (no current module)
Apr 24 18:24:14 grenache-1 worker[442143]: [debug] [pid:442143] Stopping job 17440104 from openqa.suse.de: 17440104-sle-15-SP7-Online-x86_64-Build87.1-gi-guest_developing-on-host_developing->
Apr 24 18:24:14 grenache-1 worker[442143]: [debug] [pid:442143] REST-API call: POST "http://openqa.suse.de/api/v1/jobs/17440104/status"
Apr 24 18:24:14 grenache-1 worker[442143]: [debug] [pid:442143] Announcing job termination (due to timeout) to command server via http://localhost:20143/5aNWfMLXb5Xg5VlL/broadcast
Apr 24 18:24:14 grenache-1 worker[442143]: [debug] [pid:442143] Unable to announce job termination (NOT the reason for the job termination):
Apr 24 18:24:14 grenache-1 worker[442143]: [debug] [pid:442143] Command server is likely finished already
Apr 24 18:24:15 grenache-1 worker[442143]: [debug] [pid:442143] Registered process:355478
Apr 24 18:24:19 grenache-1 worker[442143]: [info] [pid:442143] +++ worker notes +++
Apr 24 18:24:19 grenache-1 worker[442143]: [info] [pid:442143] End time: 2025-04-24 18:24:19
Apr 24 18:24:19 grenache-1 worker[442143]: [info] [pid:442143] Result: timeout
Updated by livdywan 21 days ago
robert.richardson wrote in #note-6:
As the job mentioned in the description was already deleted, I had a look at another affected job's autoinst.log; there seems to be an orphan process causing the issue. The rerun was also affected (cloned using openqa-clone-job --within-instance "https://openqa.suse.de/tests/17440104" --skip-chained-deps WORKER_CLASS=amd-zen3-gpu-sut1 _GROUP=0 {TEST,BUILD}+=-poo180689). [log excerpt as in #note-6]
Any suggestions? Maybe I can modify _collect_orphan to output the cmd associated with the stuck process?
I think extending the output would be a good "minimal improvement" here. We don't know for sure how relevant it is, but there are several of these backend::driver::_collect_orphan: Driver backend collected unknown process with pid 762893 and exit status: 84
messages. It would be good to know what command it is, and whether it is the same one every time. Maybe we can even get the output?
Updated by okurz 21 days ago
- Priority changed from High to Normal
- You asked about this topic already in Slack. Can you please feed back the suggestions provided there?
- Please check other machines besides grenache-1:14 if any are affected the same
- Apparently the problem is not that jobs are not picked up soon, but that each job on grenache-1:14, which is controlling amd-zen3-gpu-sut1, runs into a ridiculous 8h timeout selected by the test maintainers, which obviously leads to very slow execution
- There are more recent jobs on https://openqa.suse.de/admin/workers/4000 which are not running into the timeout
- I suggest you create an additional improvement ticket for os-autoinst in general to prevent the 4h of no output, with a reproducer that is quicker than 8h
Updated by robert.richardson 21 days ago
- Description updated (diff)
- Status changed from In Progress to Workable
okurz wrote in #note-8:
- You asked about this topic already in Slack. Can you please feed back the suggestions provided there?
I've updated the ticket description to include your suggestions. Here's a link for reference.
The orphan process issue does not show up with MAX_JOB_TIME set to 1.
okurz wrote in #note-8:
- Please check other machines besides grenache-1:14 if any are affected the same
The _collect_orphan calls as well as the fatal IO errors show up on grenache in general.
For example:
grenache-1:15
- sle-15-SP7-Online-aarch64-Build87.1-kdump@ipmi-64bit-squiddlydiddly (passed)
- sle-15-SP7-Online-aarch64-Build87.1-gi-guest_developing-on-host_sles12sp5-kvm@virt-arm-64bit-ipmi (failed)
Other workers I checked, such as Diesel, Petrol or Mania, do not seem to have the issue.
okurz wrote in #note-8:
- Apparently the problem is not that jobs are not picked up soon, but that each job on grenache-1:14, which is controlling amd-zen3-gpu-sut1, runs into a ridiculous 8h timeout selected by the test maintainers, which obviously leads to very slow execution.
But almost all tests on grenache show the problematic output; it is just that for some tests whatever happens after [debug] ... Passing remaining frames to the video encoder
takes considerably longer than for others and ends up running into the MAX_JOB_TIME timeout, or am I missing something here?
okurz wrote in #note-8:
- There are more recent jobs on https://openqa.suse.de/admin/workers/4000 which are not running into the timeout
Yes, many do not run into the timeout, though as mentioned above they will still include
[2025-04-28T01:42:38.431480Z] [debug] [pid:324697] Passing remaining frames to the video encoder
xterm: fatal IO error 11 (Resource temporarily unavailable) or KillClient on X server ":60297"
[2025-04-28T01:42:38.432395Z] [info] [pid:324683] ::: backend::driver::_collect_orphan: Driver backend collected unknown process with pid 325009 and exit status: 1
[2025-04-28T01:42:39.438646Z] [info] [pid:324683] ::: backend::driver::_collect_orphan: Driver backend collected unknown process with pid 325012 and exit status: 0
frame= 4272 fps=0.8 q=0.0 Lsize= 5095kB time=00:02:57.95 bitrate= 234.5kbits/s speed=0.035x
video:5067kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.552782%
[2025-04-28T01:50:37.370595Z] [debug] [pid:324697] Waiting for video encoder to finalize the video
or similar, with varying durations.
okurz wrote in #note-8:
- I suggest you create an additional improvement ticket for os-autoinst in general to prevent the 4h of no output, with a reproducer that is quicker than 8h
I might need some help with creating the reproducer. I wanted to just schedule a simple boot-to-textmode test with an appropriate MAX_JOB_TIME value, though cloning already didn't work.
What would be an appropriate (short) test to clone for this, and do I need to specify the bare-metal machine as well, besides the worker class, or any other specific settings?
Updated by livdywan 21 days ago
okurz wrote in #note-8:
- I suggest you create an additional improvement ticket for os-autoinst in general to prevent the 4h of no output, with a reproducer that is quicker than 8h
I just had a look at Mojo::IOLoop::ReadWriteProcess::Session#collected_orphan. Not sure args can be read out here, but maybe /proc/$pid/cmdline still gives us something useful?
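For reference, a rough sketch of how that information could be pulled from procfs on the worker host while the process still exists (pid 357109 is just the example from the log above; note that for a process that has already exited or been reaped, cmdline will be empty or gone, so the command may need to be captured when the process is spawned instead):
# print the NUL-separated command line of a still-running process
tr '\0' ' ' < /proc/357109/cmdline; echo
# fallback: just the executable name
cat /proc/357109/comm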
I might need some help with creating the reproducer. I wanted to just schedule a simple boot-to-textmode test with an appropriate MAX_JOB_TIME value, though cloning already didn't work.
What would be an appropriate (short) test to clone for this, and do I need to specify the bare-metal machine as well, besides the worker class, or any other specific settings?
Maybe @waynechen55 has a suggestion here? I'd suggest asking first thing in your morning in Slack, and you will probably get a quick response.
Updated by livdywan 19 days ago
Maybe @waynechen55 has a suggestion here? I'd suggest asking first thing in your morning in Slack, and you will probably get a quick response.
Wayne is on hols, but I asked anyway; maybe someone else has a suggestion 🙃
Updated by robert.richardson 19 days ago
- Status changed from Workable to Feedback
Updated by robert.richardson 19 days ago · Edited
We created some jobs with differing MAX_JOB_TIME values, using
for i in 30 90 1200 3600; do openqa-clone-job --within-instance "https://openqa.suse.de/tests/17440104" --skip-chained-deps WORKER_CLASS=amd-zen3-gpu-sut1 _GROUP=0 MAX_JOB_TIME=$i BUILD+=-poo180689 TEST+=-poo180689-t$i; done
see:
https://openqa.suse.de/tests/overview?distri=sle&build=87.1-poo180689&version=15-SP7
as well as a follow-up ticket regarding the "... backend shutdown state:" line, which should return a value.
Updated by okurz 19 days ago
- Subject changed from [openQA][worker][ipmi] It takes an extremely long time to finish the job when connected to an ipmi worker size:S to [openQA][worker][ipmi] ipmi jobs are stuck in tear-down and run into MAX_JOB_TIME (was: It takes an extremely long time to finish the job when connected to an ipmi worker) size:S
Updated by okurz 15 days ago
https://openqa.suse.de/tests/17540050#settings shows "MAX_JOB_TIME" as "$", so your script snippet does not pass the variable.
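A quick way to verify the expansion before scheduling anything is to dry-run the loop with echo in front of the clone command (same loop as above, just printed instead of executed):
# dry run: print the commands instead of running them, to confirm $i is expanded
for i in 30 90 1200 3600; do echo openqa-clone-job --within-instance "https://openqa.suse.de/tests/17440104" --skip-chained-deps WORKER_CLASS=amd-zen3-gpu-sut1 _GROUP=0 MAX_JOB_TIME=$i BUILD+=-poo180689 TEST+=-poo180689-t$i; done
If MAX_JOB_TIME does not show up as 30, 90, 1200 and 3600 in the printed commands, the variable is not being expanded in the shell that runs the loop.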
Updated by robert.richardson 14 days ago
Fixed the command and restarted the mentioned experiment jobs (gi-guest_developing-on-host_developing-kvm-poo180689-t-novideo-3-*); these also pass NOVIDEO=1 to check whether the video creation is causing the issue. Now waiting for results.
Updated by robert.richardson 13 days ago · Edited
So at least the variable was passed this time (NOVIDEO=1).
However, it still tries to pass frames to the video encoder. Maybe I misunderstood something here, but I would have expected this line not to show up (see os-autoinst/doc/backend_vars); is this a separate issue or am I missing something?
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":36913"
after 28805 requests (28648 known processed) with 0 events remaining.
[2025-04-24T14:44:18.106254Z] [debug] [pid:355418] Passing remaining frames to the video encoder
xterm: fatal IO error 11 (Resource temporarily unavailable) or KillClient on X server ":36913"
[2025-04-24T14:44:18.108519Z] [info] [pid:355398] ::: backend::driver::_collect_orphan: Driver backend collected unknown process with pid 357109 and exit status: 1
[2025-04-24T14:44:22.154614Z] [info] [pid:355398] ::: backend::driver::_collect_orphan: Driver backend collected unknown process with pid 357115 and exit status: 0
[2025-04-24T18:24:14.586636Z] [debug] [pid:355398] isotovideo received signal TERM
[2025-04-24T18:24:14.586636Z] [debug] [pid:355418] backend got TERM
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":36705"
after 28828 requests (28827 known processed) with 0 events remaining.
[2025-04-24T18:24:14.591307Z] [debug] [pid:355418] sending magic and exit
Updated by okurz 13 days ago
- Related to action #181712: IPMI tests return empty backend shutdown state added
Updated by okurz 13 days ago
I guess we still need the video encoder to process the screenshots which we record regardless. Anyway, I guess this means that the actual video encoding does not change the results.
Another thing: I found that the jobs have TIMEOUT_SCALE=3, meaning that MAX_JOB_TIME is also scaled by that multiplier, which explains why they do not stop immediately after the expected time. Please focus on getting the best reproducer, report a separate openQA improvement ticket outside "infra", and then help the OP to understand that jobs take that long to be picked up because of a very long MAX_JOB_TIME*TIMEOUT_SCALE combination.
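For illustration (assuming the 8h MAX_JOB_TIME mentioned earlier in this ticket): 28800 s * 3 = 86400 s, so the worker would only abort such a job with result "timeout" after 24 hours.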
Updated by robert.richardson 12 days ago
- Copied to action #182117: [openQA][worker][ipmi][sporadic] grenache ipmi tests stuck "Waiting for 1 jobs to finish" leading to timeouts added
Updated by robert.richardson 12 days ago
- Related to action #182123: [openQA][worker][ipmi] Common fatal io errors "Resource temporarily unavailable" and orphaned processes on grenache-1 IPMI tests added
Updated by robert.richardson 12 days ago
- Status changed from In Progress to Feedback
As mentioned in the daily, after seeing this test's autoinst.log I think that we might have been on the wrong track, as it actually fits the description of the ticket a lot better.
Sadly the job linked in the observation was already deleted when I picked up the ticket, so I/we assumed this ticket must be about the fatal IO errors and orphan-process cleanups showing up on grenache tests, leading to a long cleanup time and timeouts in certain cases (example). However, looking at the above log, the errors we have been looking at MAY actually be a separate issue.
For now I have created two follow-up issues, one regarding the fatal IO errors and orphaned processes, and another about the "... Waiting for 1 jobs to finish" message which can be seen in the log linked above.
@waynechen55 As Oli has mentioned, for now please consider lowering your timeout values and be aware of the impact of TIMEOUT_SCALE=3.
Updated by waynechen55 9 days ago
robert.richardson wrote in #note-23:
As mentioned in the daily, after seeing this test's autoinst.log I think that we might have been on the wrong track, as it actually fits the description of the ticket a lot better.
Sadly the job linked in the observation was already deleted when I picked up the ticket, so I/we assumed this ticket must be about the fatal IO errors and orphan-process cleanups showing up on grenache tests, leading to a long cleanup time and timeouts in certain cases (example). However, looking at the above log, the errors we have been looking at MAY actually be a separate issue.
For now I have created two follow-up issues, one regarding the fatal IO errors and orphaned processes, and another about the "... Waiting for 1 jobs to finish" message which can be seen in the log linked above.
@waynechen55 As Oli has mentioned, for now please consider lowering your timeout values and be aware of the impact of TIMEOUT_SCALE=3.
Lowering MAX_JOB_TIME and TIMEOUT_SCALE can just work around the issue temporarily???