action #40793
Updated by mkittler over 5 years ago
### Symptom analysis `33-developer-mode.t` is failing often on Travis with the following error message: ``` not ok 2 - web socket connection closed prematurely, was waiting for resume # Failed test 'web socket connection closed prematurely, was waiting for resume' # at /opt/testing_area/openqa/t/lib/OpenQA/Test/FullstackUtils.pm line 128. ``` On the developer console the following lines are indeed missing in the error case (compared to a good run and also what I would expect): ``` {"data":{"json_cmd_token":"GxmYxieq","resume_test_execution":null},"type":"info","what":"cmdsrvmsg"} {"data":{"reload_needles":{"cmd":"backend_reload_needles","json_cmd_token":"mzZzjVLv"}},"type":"info","what":"cmdsrvmsg"} {"data":{"check_screen":"on_prompt"},"type":"info","what":"cmdsrvmsg"} ``` In the `autoinst-log.txt` - even in the error case - we get 3 times `isotovideo: test execution will be resumed` and the test actually resumes. So the client definitely sends the message to resume correctly and the backend/isotovideo reacts accordingly. Only propagating this information back to the command server seems to fail occasionally. Hence there are also the following lines in the `autoinst-log.txt` missing in the error case: ``` [2018-09-10T15:04:59.0788 CEST] [debug] BYTES {"resume_test_execution":null,"json_cmd_token":"IetuXGbf"} [2018-09-10T15:04:59.0789 CEST] [debug] BYTES {"reload_needles":{"cmd":"backend_reload_needles","json_cmd_token":"qyQbPVBm"},"json_cmd_token":"BCZHrjDC"} ``` Those lines should be printed by the command server after propagating the commands to all clients. ### Deduction It seems that the IPC between isotovideo and the command server doesn't work reliably. ### Tasks 0. Improve log messages and error handling (eg. `BYTES ...` is not a meaningful context for the JSON document) 1. Find the conditions for "occasionally", try to reproduce locally (if possible) 2. Fix the IPC