action #59855

closed

openqaworker-arm-1 seems to be under serious distress "kernel:[93903.692361] BUG: workqueue lockup - pool cpus=32 node=0 flags=0x0 nice=0 stuck for 42657s!"

Added by okurz about 5 years ago. Updated about 5 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Start date: 2019-11-14
Due date: 2019-11-21
% Done: 0%
Estimated time:
Description

Observation

For example, see https://gitlab.suse.de/openqa/salt-states-openqa/-/jobs/139724 :

openqaworker-arm-1.suse.de:
    Minion did not return. [No response]

On the machine, in a root SSH session:

Message from syslogd@openqaworker-arm-1 at Nov 14 19:00:26 ...
 kernel:[108858.247643] BUG: workqueue lockup - pool cpus=32 node=0 flags=0x0 nice=0 stuck for 57612s!

Message from syslogd@openqaworker-arm-1 at Nov 14 19:00:56 ...
 kernel:[108888.966978] BUG: workqueue lockup - pool cpus=32 node=0 flags=0x0 nice=0 stuck for 57642s!

Message from syslogd@openqaworker-arm-1 at Nov 14 19:01:27 ...
 kernel:[108919.696316] BUG: workqueue lockup - pool cpus=32 node=0 flags=0x0 nice=0 stuck for 57673s!
…
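
The "stuck for" value grows by about 30 s with each message, so the largest value in the log shows how long the pool has been stuck. A minimal sketch for pulling that number out of kernel log lines follows. The helper name `max_stuck_seconds` is hypothetical, and the sketch assumes the message format is exactly as shown above:

```shell
# Hypothetical helper: print the longest "stuck for Ns" duration (in seconds)
# found in "workqueue lockup" lines read from stdin.
max_stuck_seconds() {
    sed -n 's/.*workqueue lockup.*stuck for \([0-9][0-9]*\)s!.*/\1/p' \
        | sort -n \
        | tail -n 1
}

# Possible usage on the affected machine (as root):
#   dmesg | max_stuck_seconds
```

The function prints nothing when no lockup line is present, so an empty result can be treated as "no lockup reported".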

Many processes are stalled in uninterruptible sleep (state D), most of them waiting on I/O, and the load is high:

# cat /proc/loadavg 
35.53 35.65 35.62 4/877 27365
# ps -weo stat,pid,wchan:32,args | grep '^D\>'
D      362 rcu_exp_wait_wake                [kworker/6:1]
D     4189 io_schedule                      [kworker/u96:9]
D     8636 io_schedule                      /usr/bin/perl /usr/share/openqa/script/openqa-workercache minion worker -m production
D     8893 io_schedule                      /usr/bin/perl /usr/share/openqa/script/openqa-workercache minion worker -m production
D     9272 io_schedule                      /usr/bin/perl /usr/share/openqa/script/openqa-workercache minion worker -m production
D     9654 io_schedule                      /usr/bin/perl /usr/share/openqa/script/openqa-workercache minion worker -m production
D    14271 flush_work                       /usr/bin/gpg2 --version
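
To see at a glance where the blocked processes are waiting, the same `ps` output can be grouped by wait channel. This is only a sketch: the function name `summarize_dstate` is hypothetical, and unlike the `grep '^D\>'` above it matches every state string beginning with `D` (for example `Ds` or `Dl`), which may or may not be what is wanted:

```shell
# Hypothetical helper: count state-D processes per kernel wait channel.
# Expects `ps -weo stat,pid,wchan:32,args` style output on stdin
# (column 1 = state, column 3 = wait channel).
summarize_dstate() {
    awk '$1 ~ /^D/ { count[$3]++ }
         END { for (w in count) print count[w], w }' \
        | sort -rn
}

# Possible usage:
#   ps -weo stat,pid,wchan:32,args | summarize_dstate
```

Applied to the listing above, this would put `io_schedule` first, since five of the seven blocked processes are waiting there.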