action #23536

closed

[tools] org.freedesktop.DBus.Error.NoReply: Did not receive a reply. appears regularly in openQA logs

Added by nicksinger over 6 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date: 2017-08-11
Due date:
% Done: 0%
Estimated time:

Description

Since the heavy modifications to the scheduler (a rough estimate), the following error has been appearing regularly in the openQA log files:

[Wed Aug 23 09:56:24 2017] [11197:error] org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.

Context from the log file:

[Wed Aug 23 09:56:21 2017] [websockets:error] Worker not found for given connection during connection close
[Wed Aug 23 09:56:22 2017] [3069:info] Stopping worker 16795 gracefully (800 seconds)
[Wed Aug 23 09:56:22 2017] [23576:info] Worker 23576 started
[Wed Aug 23 09:56:22 2017] [23576:info] Connecting to AMQP server
[Wed Aug 23 09:56:22 2017] [3069:info] Worker 16795 stopped
[Wed Aug 23 09:56:22 2017] [23576:info] AMQP connection established
[Wed Aug 23 09:56:24 2017] [3069:info] Stopping worker 16889 gracefully (800 seconds)
[Wed Aug 23 09:56:24 2017] [23578:info] Worker 23578 started
[Wed Aug 23 09:56:24 2017] [23578:info] Connecting to AMQP server
[Wed Aug 23 09:56:24 2017] [23578:info] AMQP connection established
[Wed Aug 23 09:56:24 2017] [11197:error] org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
[Wed Aug 23 09:56:24 2017] [18942:error] org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
[Wed Aug 23 09:56:24 2017] [13669:error] org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
[Wed Aug 23 09:56:24 2017] [22897:error] org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
[Wed Aug 23 09:56:25 2017] [7513:debug] removing screenshot 4fb/518/987743821945823012420a62bd.png
[Wed Aug 23 09:56:25 2017] [7513:debug] removing screenshot 83c/f60/e85ad7da4f25b1eb96f0680aa9.png
[Wed Aug 23 09:56:25 2017] [7513:debug] removing screenshot 581/fd4/bdf3965a15065f31523cef9463.png
[Wed Aug 23 09:56:25 2017] [7513:debug] removing screenshot 7ad/564/ae42ed15f1806c65f71cbfd4f9.png
[Wed Aug 23 09:56:25 2017] [7513:debug] removing screenshot 898/5a6/6807e3c39b97dd10458dc2b70d.png
[Wed Aug 23 09:56:25 2017] [3069:info] Stopping worker 5110 gracefully (800 seconds)
[Wed Aug 23 09:56:25 2017] [3069:info] Worker 5110 stopped
[Wed Aug 23 09:56:25 2017] [23579:info] Worker 23579 started
[Wed Aug 23 09:56:25 2017] [23579:info] Connecting to AMQP server

All log lines related to one of the workers that raised this message (11197):

[Wed Aug 23 08:57:57 2017] [11197:info] Worker 11197 started
[Wed Aug 23 08:57:57 2017] [11197:info] Connecting to AMQP server
[Wed Aug 23 08:57:57 2017] [11197:info] AMQP connection established
[Wed Aug 23 09:13:00 2017] [7513:debug] removing screenshot a4f/fc9/e72271244111977be85ad8dcc1.png
[Wed Aug 23 09:53:30 2017] [11197:info] Got status update for job 1125270 that does not belong to Worker 543
[Wed Aug 23 09:56:24 2017] [11197:error] org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
[Wed Aug 23 09:56:39 2017] [3069:info] Stopping worker 11197 gracefully (800 seconds)
[Wed Aug 23 09:56:39 2017] [3069:info] Worker 11197 stopped
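An excerpt like the one above can be produced with a small filter. The following is only a sketch: it assumes the line layout seen in these logs (`[timestamp] [pid-or-component:level] message`), and the function name `lines_for_worker` is made up for illustration. It keeps lines logged by the given PID plus lines that mention "worker <pid>", which avoids accidental substring hits such as the screenshot line above, whose file name merely contains the digits 11197.

```python
import re

# Assumed layout, taken from the excerpts in this ticket:
# [<timestamp>] [<pid or component>:<level>] <message>
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<source>[^:\]]+):(?P<level>\w+)\] (?P<msg>.*)$"
)


def lines_for_worker(lines, pid):
    """Return lines logged by `pid` or mentioning 'worker <pid>' in the message."""
    pid = str(pid)
    mention = re.compile(r"\bworker " + re.escape(pid) + r"\b", re.IGNORECASE)
    selected = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not follow the assumed layout
        if match.group("source") == pid or mention.search(match.group("msg")):
            selected.append(line)
    return selected
```

Called as `lines_for_worker(open(path), 11197)`, where `path` is whatever log file is being inspected, this would return the worker's own lines and the supervisor's "Stopping worker 11197" lines, but not the unrelated screenshot cleanup.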

Unfortunately, I cannot see what exactly is causing the issue here.

Suggestions on how to improve this message:

  • If possible, include more specific context in the message (Why was it raised? What was the worker trying to do?)

If this message is critical (not self recovering):

  • Add hints about where an admin could look for more information
  • Expand the message to tell the admin: "Hey, something just broke - you need to intervene"

If this message should just inform the admin:

  • Lower the log level to at most "warn"
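
To make the suggestions above more concrete, here is a rough sketch of a wrapper that adds context to the error and picks the log level depending on whether the failure is critical. It is written in Python purely for illustration and is not openQA code: `NoReplyError`, `call_with_context`, the logger name and the example action text are all hypothetical stand-ins.

```python
import logging

log = logging.getLogger("openqa.sketch")  # hypothetical logger name


class NoReplyError(Exception):
    """Hypothetical stand-in for org.freedesktop.DBus.Error.NoReply."""


def call_with_context(action, func, *args, critical=False, hint=None):
    """Run func(*args); if it raises NoReplyError, log what was being attempted.

    Returns the result of func, or None when no reply was received.
    """
    try:
        return func(*args)
    except NoReplyError as err:
        message = f"D-Bus call failed while trying to {action}: {err}"
        if critical:
            # Not self-recovering: tell the admin to act and where to look.
            message += " (manual intervention needed"
            message += f"; {hint})" if hint else ")"
            log.error(message)
        else:
            # Purely informational: stay at "warn" or below.
            log.warning(message + " (informational, no action required)")
        return None
```

A caller would pass a short description of the operation, for example `call_with_context("assign a job to a worker", some_dbus_call, critical=True, hint="check the scheduler service")`, so that the resulting log line says what was attempted instead of only repeating the generic D-Bus text.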