action #23536
closed [tools] org.freedesktop.DBus.Error.NoReply: Did not receive a reply. appears regularly in openQA logs
Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date: 2017-08-11
Due date:
% Done: 0%
Estimated time:
Description
Since roughly the time the scheduler was heavily modified, the following error has been appearing regularly in the openQA log files:
[Wed Aug 23 09:56:24 2017] [11197:error] org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Context from the log file:
[Wed Aug 23 09:56:21 2017] [websockets:error] Worker not found for given connection during connection close
[Wed Aug 23 09:56:22 2017] [3069:info] Stopping worker 16795 gracefully (800 seconds)
[Wed Aug 23 09:56:22 2017] [23576:info] Worker 23576 started
[Wed Aug 23 09:56:22 2017] [23576:info] Connecting to AMQP server
[Wed Aug 23 09:56:22 2017] [3069:info] Worker 16795 stopped
[Wed Aug 23 09:56:22 2017] [23576:info] AMQP connection established
[Wed Aug 23 09:56:24 2017] [3069:info] Stopping worker 16889 gracefully (800 seconds)
[Wed Aug 23 09:56:24 2017] [23578:info] Worker 23578 started
[Wed Aug 23 09:56:24 2017] [23578:info] Connecting to AMQP server
[Wed Aug 23 09:56:24 2017] [23578:info] AMQP connection established
[Wed Aug 23 09:56:24 2017] [11197:error] org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
[Wed Aug 23 09:56:24 2017] [18942:error] org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
[Wed Aug 23 09:56:24 2017] [13669:error] org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
[Wed Aug 23 09:56:24 2017] [22897:error] org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
[Wed Aug 23 09:56:25 2017] [7513:debug] removing screenshot 4fb/518/987743821945823012420a62bd.png
[Wed Aug 23 09:56:25 2017] [7513:debug] removing screenshot 83c/f60/e85ad7da4f25b1eb96f0680aa9.png
[Wed Aug 23 09:56:25 2017] [7513:debug] removing screenshot 581/fd4/bdf3965a15065f31523cef9463.png
[Wed Aug 23 09:56:25 2017] [7513:debug] removing screenshot 7ad/564/ae42ed15f1806c65f71cbfd4f9.png
[Wed Aug 23 09:56:25 2017] [7513:debug] removing screenshot 898/5a6/6807e3c39b97dd10458dc2b70d.png
[Wed Aug 23 09:56:25 2017] [3069:info] Stopping worker 5110 gracefully (800 seconds)
[Wed Aug 23 09:56:25 2017] [3069:info] Worker 5110 stopped
[Wed Aug 23 09:56:25 2017] [23579:info] Worker 23579 started
[Wed Aug 23 09:56:25 2017] [23579:info] Connecting to AMQP server
All log entries related to one of the workers that raised this message:
[Wed Aug 23 08:57:57 2017] [11197:info] Worker 11197 started
[Wed Aug 23 08:57:57 2017] [11197:info] Connecting to AMQP server
[Wed Aug 23 08:57:57 2017] [11197:info] AMQP connection established
[Wed Aug 23 09:13:00 2017] [7513:debug] removing screenshot a4f/fc9/e72271244111977be85ad8dcc1.png
[Wed Aug 23 09:53:30 2017] [11197:info] Got status update for job 1125270 that does not belong to Worker 543
[Wed Aug 23 09:56:24 2017] [11197:error] org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
[Wed Aug 23 09:56:39 2017] [3069:info] Stopping worker 11197 gracefully (800 seconds)
[Wed Aug 23 09:56:39 2017] [3069:info] Worker 11197 stopped
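A per-worker excerpt like the one above can be collected with a small filter. The following is only a sketch in Python, not how the excerpt was actually produced: the function name `entries_for_worker` and the regular expression are assumptions based on the `[timestamp] [source:level] message` layout visible in the log. It matches the PID as a whole number, so screenshot file names that merely contain the digits are skipped.

```python
import re

# Assumed line layout, inferred from the excerpt: "[timestamp] [source:level] message".
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<source>[^:\]]+):(?P<level>\w+)\] (?P<msg>.*)$"
)


def entries_for_worker(lines, pid):
    """Yield lines emitted by `pid` or mentioning it as a whole number.

    Hypothetical helper for illustration; not part of openQA.
    """
    # Whole-number match: "11197" inside a longer hex file name does not count.
    mention = re.compile(rf"(?<!\w){re.escape(pid)}(?!\w)")
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines that do not follow the assumed layout
        if m["source"] == pid or mention.search(m["msg"]):
            yield line
```

For example, `list(entries_for_worker(open("openqa.log"), "11197"))` would return the lines for that worker, where `openqa.log` is a placeholder path.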
Unfortunately I cannot see what exactly is causing the issue here.
Suggestions on how to improve this message:
- If possible, include more specific reasons (what is the context of the message? What did the worker try to do?)
If this message is critical (not self-recovering):
- Add hints on where an admin could look for more information
- Expand the message to tell the admin: "Hey, something just broke - you need to intervene"
If this message should just inform the admin:
- Lower the log level to "warn" at most
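
The suggestions above could look roughly like the following sketch. It is written in Python with the standard `logging` module purely for illustration; it is not openQA code. The helper `call_with_context`, its parameters, and the use of `TimeoutError` as a stand-in for the D-Bus NoReply error are all assumptions.

```python
import logging

log = logging.getLogger("ipc-sketch")  # hypothetical logger name


def call_with_context(action, func, *, recoverable, hint=""):
    """Run `func` and log a failure together with what was being attempted.

    Hypothetical helper: `action` describes what the caller tried to do,
    `recoverable` says whether the system recovers without admin help,
    and `hint` tells an admin where to look next.
    """
    try:
        return func()
    except TimeoutError as err:  # stand-in for a "did not receive a reply" error
        if recoverable:
            # Informational only: keep it at warning level at most.
            log.warning("%s got no reply, continuing: %s", action, err)
            return None
        # Critical: say clearly that intervention is needed and where to look.
        log.error(
            "%s got no reply and needs admin attention: %s. %s",
            action, err, hint,
        )
        raise
```

A caller would pass a description such as "notifying the scheduler about job N" as `action`, so the log line states what the worker was trying to do when the reply timed out.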