action #40793

Prevent isotovideo <=> command server IPC to occasionally lose messages

Added by mkittler over 1 year ago. Updated over 1 year ago.

Status: Resolved
Start date: 10/09/2018
Priority: Normal
Due date:
Assignee: mkittler
% Done: 0%
Category: Feature requests
Target version: Done
Difficulty:
Duration:

Description

Symptom analysis

33-developer-mode.t often fails on Travis with the following error message:

not ok 2 - web socket connection closed prematurely, was waiting for resume
# Failed test 'web socket connection closed prematurely, was waiting for resume'
# at /opt/testing_area/openqa/t/lib/OpenQA/Test/FullstackUtils.pm line 128.

In the error case, the following lines are missing from the developer console, compared to both a good run and what I would expect:

{"data":{"json_cmd_token":"GxmYxieq","resume_test_execution":null},"type":"info","what":"cmdsrvmsg"}
{"data":{"reload_needles":{"cmd":"backend_reload_needles","json_cmd_token":"mzZzjVLv"}},"type":"info","what":"cmdsrvmsg"}
{"data":{"check_screen":"on_prompt"},"type":"info","what":"cmdsrvmsg"}

In autoinst-log.txt, the line "isotovideo: test execution will be resumed" appears 3 times even in the error case, and the test actually resumes. So the client definitely sends the resume message correctly and the backend/isotovideo reacts accordingly. Only propagating this information back to the command server seems to fail occasionally. Hence the following lines are also missing from autoinst-log.txt in the error case:

[2018-09-10T15:04:59.0788 CEST] [debug] BYTES {"resume_test_execution":null,"json_cmd_token":"IetuXGbf"}
[2018-09-10T15:04:59.0789 CEST] [debug] BYTES {"reload_needles":{"cmd":"backend_reload_needles","json_cmd_token":"qyQbPVBm"},"json_cmd_token":"BCZHrjDC"}

Those lines should be printed by the command server after propagating the commands to all clients.
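
The comparison described above can be automated for a given autoinst-log.txt. The following Python sketch counts the resume lines and the command server's BYTES lines that carry resume_test_execution. The function name count_resume_events is hypothetical, and the expectation that every resume produces exactly one such BYTES line is an assumption drawn from the excerpts in this ticket, not a documented guarantee.

```python
import json
import re

# Marker quoted in this ticket; printed when isotovideo resumes the test.
RESUME_MARKER = "isotovideo: test execution will be resumed"
# Command server debug lines as quoted in this ticket: "[debug] BYTES {...}".
BROADCAST_RE = re.compile(r"\[debug\] BYTES (\{.*\})\s*$")


def count_resume_events(log_text):
    """Return (resumed, broadcast) counts for one log (hypothetical helper)."""
    resumed = 0
    broadcast = 0
    for line in log_text.splitlines():
        if RESUME_MARKER in line:
            resumed += 1
        match = BROADCAST_RE.search(line)
        if match is None:
            continue
        try:
            document = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # truncated or non-JSON payload, ignore it
        # Assumption: one such BYTES line is expected per resume.
        if isinstance(document, dict) and "resume_test_execution" in document:
            broadcast += 1
    return resumed, broadcast
```

If the second number is smaller than the first, the log shows the symptom described here: the test resumed, but the command server did not log the propagation.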

Deduction

It seems that the IPC between isotovideo and the command server doesn't work reliably.

Tasks

  1. Improve log messages and error handling (e.g. BYTES ... is not a meaningful context for the JSON document)
  2. Find the conditions for "occasionally", try to reproduce locally (if possible)
  3. Fix the IPC
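
This ticket does not state the root cause, so the following Python sketch is only a general illustration of tasks 1 and 3, not the fix that was applied. It assumes newline-delimited JSON on the IPC channel and shows a reader that buffers partial chunks, so that neither a document split across two reads nor two documents arriving in one read gets lost. It also shows a log line with more context than BYTES. The names JsonLineReader and describe_broadcast, the framing, and the log format are all hypothetical.

```python
import json


class JsonLineReader:
    """Reassemble newline-delimited JSON documents from arbitrary chunks."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk):
        """Add raw bytes and return all documents completed so far."""
        self._buffer += chunk
        # Everything before the last newline is complete; the rest is kept
        # in the buffer until the next chunk arrives.
        *complete, self._buffer = self._buffer.split(b"\n")
        return [json.loads(raw) for raw in complete if raw.strip()]


def describe_broadcast(message, client_count):
    """Build a log line naming what is sent and to how many clients."""
    kinds = ", ".join(sorted(k for k in message if k != "json_cmd_token"))
    return "cmdsrv: broadcasting [%s] to %d client(s): %s" % (
        kinds, client_count, json.dumps(message, sort_keys=True))
```

For example, feeding the reader two chunks that split a document in the middle still yields both documents, one per call, once each terminating newline has arrived.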

History

#1 Updated by mkittler over 1 year ago

  • Description updated (diff)

#2 Updated by mkittler over 1 year ago

  • Subject changed from Fix occational problems of isotovideo <=> command server IPC loosing messages to Prevent isotovideo <=> command server IPC to occasionally lose messages
  • Status changed from New to In Progress

#3 Updated by coolo over 1 year ago

  • Status changed from In Progress to Resolved
  • Target version set to Done
