Project

General

Profile

action #53603

unstable test "t/33-developer_mode.t"

Added by okurz about 1 year ago. Updated about 1 year ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Concrete Bugs
Target version:
Start date:
2019-07-01
Due date:
% Done:

0%

Estimated time:
Difficulty:

Description

Observation

See https://travis-ci.org/os-autoinst/openQA/jobs/552644731

+ perl t/33-developer_mode.t
ok 1
[…]
    not ok 2 - web socket connection closed prematurely, was waiting for response to set_pause_on_screen_mismatch
    #   Failed test 'web socket connection closed prematurely, was waiting for response to set_pause_on_screen_mismatch'
    #   at /opt/testing_area/openqa/t/lib/OpenQA/Test/FullstackUtils.pm line 162.
    # developer console contained at this point:
    # Connecting to ws:/localhost:44638/liveviewhandler/tests/1/developer/ws-proxy
    # Connection opened
    # {"data":null,"type":"info","what":"connecting to os-autoinst command server at ws:\/\/127.0.0.1:20013\/Xa23gr5SCJ27MUT0\/ws"}
    # {"data":null,"type":"info","what":"connected to os-autoinst command server"}
    # {"data":{"current_api_function":"assert_screen","current_test_full_name":"tinycore\/tests-boot","devel_mode_major_version":1,"devel_mode_minor_version":1,"developer_id":2,"developer_name":"Demo","developer_session_started_at":"2019-07-01T10:26:47","developer_session_tab_count":1,"outstanding_files":null,"outstanding_images":null,"pause_on_next_command":0,"pause_on_screen_mismatch":false,"pause_test_name":null,"running":"boot","tags":["core"],"test_execution_paused":0,"upload_up_to_current_module":null},"type":"info","what":"cmdsrvmsg"}
    # {"data":{"check_screen":{"check":0,"mustmatch":"core","timeout":60},"current_api_function":"assert_screen"},"type":"info","what":"cmdsrvmsg"}
    # {"data":{"check_screen":{"check":0,"mustmatch":"core","timeout":30},"current_api_function":"assert_screen"},"type":"info","what":"cmdsrvmsg"}
    # {"data":{"current_api_function":"send_key","send_key":{"cmd":"backend_send_key","json_cmd_token":"XdQxfOcV","key":"ret"}},"type":"info","what":"cmdsrvmsg"}
    # {"data":{"check_screen":{"check":0,"mustmatch":"on_prompt","timeout":600},"current_api_function":"assert_screen"},"type":"info","what":"cmdsrvmsg"}
    # {"data":{"current_api_function":"assert_screen","current_test_full_name":"tinycore\/tests-boot","devel_mode_major_version":1,"devel_mode_minor_version":1,"developer_id":2,"developer_name":"Demo","developer_session_started_at":"2019-07-01T10:26:47","developer_session_tab_count":1,"outstanding_files":null,"outstanding_images":null,"pause_on_next_command":0,"pause_on_screen_mismatch":false,"pause_test_name":null,"running":"boot","tags":["on_prompt"],"test_execution_paused":0,"upload_up_to_current_module":null},"type":"info","what":"cmdsrvmsg"}
    # {"data":{"ret":1},"type":"info","what":"cmdsrvmsg"}
    # {"data":{"current_api_function":"assert_screen","current_test_full_name":"tinycore\/tests-boot","devel_mode_major_version":1,"devel_mode_minor_version":1,"developer_id":2,"developer_name":"Demo","developer_session_started_at":"2019-07-01T10:26:47","developer_session_tab_count":1,"outstanding_files":null,"outstanding_images":null,"pause_on_next_command":0,"pause_on_screen_mismatch":"assert_screen","pause_test_name":null,"running":"boot","tags":["on_prompt"],"test_execution_paused":0,"upload_up_to_current_module":null},"type":"info","what":"cmdsrvmsg"}
    # {"data":{"outstanding_files":0,"outstanding_images":0,"upload_up_to":null,"upload_up_to_current_module":null},"type":"info","what":"upload progress"}
    # {"data":{"category":"cmdsrv-connection","code":1006,"reason":null},"type":"error","what":"connection to os-autoinst command server lost"}
    # Connection closed, trying to reconnect in 2 seconds
    # waiting for response to set_assert_screen_timeout, developer console contains:
    # Connecting to ws://127.0.0.1:20013/Xa23gr5SCJ27MUT0/ws
    # Connection opened
    # {"current_api_function":"assert_screen","current_test_full_name":"tinycore\/tests-boot","devel_mode_major_version":1,"devel_mode_minor_version":1,"pause_on_next_command":0,"pause_on_screen_mismatch":"assert_screen","pause_test_name":null,"running":"boot","tags":["on_prompt"],"test_execution_paused":0}
    # waiting for response to set_assert_screen_timeout, developer console contains:
    # Connecting to ws://127.0.0.1:20013/Xa23gr5SCJ27MUT0/ws
    # Connection opened
    # {"current_api_function":"assert_screen","current_test_full_name":"tinycore\/tests-boot","devel_mode_major_version":1,"devel_mode_minor_version":1,"pause_on_next_command":0,"pause_on_screen_mismatch":"assert_screen","pause_test_name":null,"running":"boot","tags":["on_prompt"],"test_execution_paused":0}
    # {"paused":{"check":0,"cmd":"report_timeout","json_cmd_token":"cFtLjWbS","msg":"match=on_prompt timed out after 600 (assert_screen)","tags":["on_prompt"]},"reason":"match=on_prompt timed out after 600 (assert_screen)"}

Related issues

Related to openQA Project - action #53594: travis CI tests fail on master in t/33-developer_mode.t "owned by Demo (2 minutes ago, developer has 3 tabs open)' doesn't match '(?^:owned by Demo.*2 tabs open)"Resolved2019-07-01

Related to openQA Project - action #55460: flaky test: t/33-developer_mode.t - not ok 308 - Wait for (?^:\"set_pause_on_screen_mismatch\":\"assert_screen\") timed outResolved2019-08-13

History

#1 Updated by mkittler about 1 year ago

  • Status changed from New to In Progress
  • Assignee set to mkittler
  • Target version set to Current Sprint

I'm adding this to the current sprint because I'm working on it anyways. This problem is just too annoying.

Since the ticket has been created I've already improved the test to handle disconnects better. But it is still not sufficient. It seems that the log still misses sometimes the response from isotovdeo for certain commands (most of the time for resume_test_execution).

The error handling of the liveview handler should be improved because sometimes the log also contains errors like which most likely come from the liveviewhandler:

    # <== {"data":{"outstanding_files":0,"outstanding_images":0,"upload_up_to":"shutdown","upload_up_to_current_module":1},"type":"info","what":"upload progress"}
String found where operator expected at template layouts/error.html.ep line 20, near "asset 'bootstrap.css'"
    (Do you need to predeclare asset?)
String found where operator expected at template layouts/error.html.ep line 21, near "asset 'bootstrap.js'"
    (Do you need to predeclare asset?)
Mojo::Reactor::Poll: I/O watcher failed: syntax error at template layouts/error.html.ep line 20, near "asset 'bootstrap.css'"
syntax error at template layouts/error.html.ep line 21, near "asset 'bootstrap.js'"
15:       % {
16:       <title><%= $appname %></title>
17:       % }
18: 
19:       <!-- Bootstrap core CSS -->
20:       %= asset 'bootstrap.css'
21:       %= asset 'bootstrap.js'
22: 
23:       %= content_for 'head'
24: 
25:       %= javascript begin
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/Template.pm:166 (Mojo::Template)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/Template.pm:177 (Mojo::Template)
/usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/EPLRenderer.pm:41 (Mojolicious::Plugin::EPLRenderer)
/usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/EPRenderer.pm:41 (Mojolicious::Plugin::EPRenderer)
/usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Renderer.pm:227 (Mojolicious::Renderer)
/usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Renderer.pm:112 (Mojolicious::Renderer)
/usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Controller.pm:156 (Mojolicious::Controller)
/usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Controller.pm:169 (Mojolicious::Controller)
/usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm:126 (Mojolicious::Plugin::DefaultHelpers)
/usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm:112 (Mojolicious::Plugin::DefaultHelpers)
/usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugin/DefaultHelpers.pm:46 (Mojolicious::Plugin::DefaultHelpers)
/usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Renderer.pm:70 (Mojolicious::Renderer)
/usr/lib/perl5/vendor_perl/5.18.2/Mojolicious.pm:204 (Mojolicious)
/usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugins.pm:15 (Mojolicious::Plugins)
/usr/lib/perl5/vendor_perl/5.18.2/Mojolicious/Plugins.pm:18 (Mojolicious::Plugins)
/usr/lib/perl5/vendor_perl/5.18.2/Mojolicious.pm:147 (Mojolicious)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server.pm:68 (Mojo::Server)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/EventEmitter.pm:15 (Mojo::EventEmitter)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm:104 (Mojo::Server::Daemon)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/EventEmitter.pm:15 (Mojo::EventEmitter)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/Transaction/HTTP.pm:60 (Mojo::Transaction::HTTP)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm:219 (Mojo::Server::Daemon)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm:200 (Mojo::Server::Daemon)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/EventEmitter.pm:15 (Mojo::EventEmitter)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/IOLoop/Stream.pm:103 (Mojo::IOLoop::Stream)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/IOLoop/Stream.pm:51 (Mojo::IOLoop::Stream)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/Reactor/Poll.pm:143 (Mojo::Reactor::Poll)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/Reactor/Poll.pm:143 (Mojo::Reactor::Poll)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/Reactor/Poll.pm:58 (Mojo::Reactor::Poll)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/Reactor/Poll.pm:99 (Mojo::Reactor::Poll)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/IOLoop.pm:134 (Mojo::IOLoop)
/usr/lib/perl5/vendor_perl/5.18.2/Mojo/Server/Daemon.pm:37 (Mojo::Server::Daemon)
/opt/testing_area/openqa/t/lib/OpenQA/Test/Utils.pm:261 (OpenQA::Test::Utils)
t/33-developer_mode.t:145 (main)

Strangely, sometimes the reponse is not missing but repeated twice:

    # ==> {"cmd":"resume_test_execution"}
    # <== {"data":{"ret":1},"type":"info","what":"cmdsrvmsg"}
    # <== {"data":{"resume_test_execution":null},"type":"info","what":"cmdsrvmsg"}
    # <== {"data":{"ret":1},"type":"info","what":"cmdsrvmsg"}
    # <== {"data":{"resume_test_execution":null},"type":"info","what":"cmdsrvmsg"}

#2 Updated by mkittler about 1 year ago

Example how it looks like when in a successful run (despite the connection being interrupted):

# Subtest: resume test execution and 2nd tab
    ok 1 - 2 browser tabs open (live view and tab from previous subtest)
    ok 2 - found connection opened at 135
    # waiting for resume, developer console contains:
    # status: Connecting to ws:/localhost:49280/liveviewhandler/tests/1/developer/ws-proxy
    # status: Using proxy: yes
    # status: Connection opened
    # <== {"data":null,"type":"info","what":"connecting to os-autoinst command server at ws:\/\/127.0.0.1:20013\/1wTUiBKtudqe3ExH\/ws"}
    # <== {"data":null,"type":"info","what":"reusing previous connection to os-autoinst command server at ws:\/\/127.0.0.1:20013\/1wTUiBKtudqe3ExH\/ws"}
    # <== {"data":{"outstanding_files":0,"outstanding_images":0,"upload_up_to":null,"upload_up_to_current_module":null},"type":"info","what":"upload progress"}
    # <== {"data":{"current_api_function":"last_screenshot_data","current_test_full_name":"tinycore\/tests-shutdown","devel_mode_major_version":1,"devel_mode_minor_version":1,"developer_id":2,"developer_name":"Demo","developer_session_started_at":"2019-07-25T14:13:43","developer_session_tab_count":2,"outstanding_files":0,"outstanding_images":0,"pause_on_next_command":0,"pause_on_screen_mismatch":"assert_screen","pause_test_name":null,"running":"shutdown","tags":null,"test_execution_paused":"reached module shutdown","upload_up_to_current_module":null},"type":"info","what":"cmdsrvmsg"}
    # <== {"data":{"current_api_function":"last_screenshot_data","current_test_full_name":"tinycore\/tests-shutdown","devel_mode_major_version":1,"devel_mode_minor_version":1,"developer_id":2,"developer_name":"Demo","developer_session_started_at":"2019-07-25T14:13:43","developer_session_tab_count":2,"outstanding_files":0,"outstanding_images":0,"pause_on_next_command":0,"pause_on_screen_mismatch":"assert_screen","pause_test_name":null,"running":"shutdown","tags":null,"test_execution_paused":"reached module shutdown","upload_up_to_current_module":null},"type":"info","what":"cmdsrvmsg"}
    # status: Can't send command, no ws connection opened! Will try to send when connection has been restored.
    # waiting for resume, developer console contains:
    # status: Connecting to ws:/localhost:49280/liveviewhandler/tests/1/developer/ws-proxy
    # status: Using proxy: yes
    # status: Connection opened
    # <== {"data":null,"type":"info","what":"connecting to os-autoinst command server at ws:\/\/127.0.0.1:20013\/1wTUiBKtudqe3ExH\/ws"}
    # <== {"data":null,"type":"info","what":"reusing previous connection to os-autoinst command server at ws:\/\/127.0.0.1:20013\/1wTUiBKtudqe3ExH\/ws"}
    # <== {"data":{"outstanding_files":0,"outstanding_images":0,"upload_up_to":null,"upload_up_to_current_module":null},"type":"info","what":"upload progress"}
    # <== {"data":{"current_api_function":"last_screenshot_data","current_test_full_name":"tinycore\/tests-shutdown","devel_mode_major_version":1,"devel_mode_minor_version":1,"developer_id":2,"developer_name":"Demo","developer_session_started_at":"2019-07-25T14:13:43","developer_session_tab_count":2,"outstanding_files":0,"outstanding_images":0,"pause_on_next_command":0,"pause_on_screen_mismatch":"assert_screen","pause_test_name":null,"running":"shutdown","tags":null,"test_execution_paused":"reached module shutdown","upload_up_to_current_module":null},"type":"info","what":"cmdsrvmsg"}
    # <== {"data":{"current_api_function":"last_screenshot_data","current_test_full_name":"tinycore\/tests-shutdown","devel_mode_major_version":1,"devel_mode_minor_version":1,"developer_id":2,"developer_name":"Demo","developer_session_started_at":"2019-07-25T14:13:43","developer_session_tab_count":2,"outstanding_files":0,"outstanding_images":0,"pause_on_next_command":0,"pause_on_screen_mismatch":"assert_screen","pause_test_name":null,"running":"shutdown","tags":null,"test_execution_paused":"reached module shutdown","upload_up_to_current_module":null},"type":"info","what":"cmdsrvmsg"}
    # status: Can't send command, no ws connection opened! Will try to send when connection has been restored.
    # <== {"data":{"outstanding_files":0,"outstanding_images":0,"upload_up_to":"shutdown","upload_up_to_current_module":1},"type":"info","what":"upload progress"}
    # waiting for resume, developer console contains:
    # status: Connecting to ws:/localhost:49280/liveviewhandler/tests/1/developer/ws-proxy
    # status: Using proxy: yes
    # status: Connection opened
    # <== {"data":null,"type":"info","what":"connecting to os-autoinst command server at ws:\/\/127.0.0.1:20013\/1wTUiBKtudqe3ExH\/ws"}
    # <== {"data":null,"type":"info","what":"reusing previous connection to os-autoinst command server at ws:\/\/127.0.0.1:20013\/1wTUiBKtudqe3ExH\/ws"}
    # <== {"data":{"outstanding_files":0,"outstanding_images":0,"upload_up_to":null,"upload_up_to_current_module":null},"type":"info","what":"upload progress"}
    # <== {"data":{"current_api_function":"last_screenshot_data","current_test_full_name":"tinycore\/tests-shutdown","devel_mode_major_version":1,"devel_mode_minor_version":1,"developer_id":2,"developer_name":"Demo","developer_session_started_at":"2019-07-25T14:13:43","developer_session_tab_count":2,"outstanding_files":0,"outstanding_images":0,"pause_on_next_command":0,"pause_on_screen_mismatch":"assert_screen","pause_test_name":null,"running":"shutdown","tags":null,"test_execution_paused":"reached module shutdown","upload_up_to_current_module":null},"type":"info","what":"cmdsrvmsg"}
    # <== {"data":{"current_api_function":"last_screenshot_data","current_test_full_name":"tinycore\/tests-shutdown","devel_mode_major_version":1,"devel_mode_minor_version":1,"developer_id":2,"developer_name":"Demo","developer_session_started_at":"2019-07-25T14:13:43","developer_session_tab_count":2,"outstanding_files":0,"outstanding_images":0,"pause_on_next_command":0,"pause_on_screen_mismatch":"assert_screen","pause_test_name":null,"running":"shutdown","tags":null,"test_execution_paused":"reached module shutdown","upload_up_to_current_module":null},"type":"info","what":"cmdsrvmsg"}
    # status: Can't send command, no ws connection opened! Will try to send when connection has been restored.
    # <== {"data":{"outstanding_files":0,"outstanding_images":0,"upload_up_to":"shutdown","upload_up_to_current_module":1},"type":"info","what":"upload progress"}
    # <== {"data":{"category":"cmdsrv-connection","code":1006,"reason":null},"type":"error","what":"connection to os-autoinst command server lost"}
    # <== {"data":{"category":"cmdsrv-connection","code":1006,"reason":null},"type":"error","what":"connection to os-autoinst command server lost"}
    # status: Connection error: error (check JavaScript console for details)
    # status: Connection closed, trying to reconnect in 2 seconds
    ok 3 - found resume at 4399
    ok 4 - found resume (2nd tab) at 9198
    1..4
ok 25 - resume test execution and 2nd tab

#3 Updated by mkittler about 1 year ago

The situation in the initial ticket description could be easily avoided by passing the current job ID to the web UI on worker registration if it is re-registering while a job is running.

Than this code which is executed on registration would be only conditionally executed:

    # in case the worker died ...
    # ... restart job assigned to this worker
    if (my $job = $worker->job) {
        $job->set_property('JOBTOKEN');
        $job->auto_duplicate;

        # .. set it incomplete
        $job->done(result => OpenQA::Jobs::Constants::INCOMPLETE);
        $worker->update({job_id => undef});
    }

So in case the worker still works on that job we wouldn't execute that if-block anymore. I guess that solves the problem without introducing any new ones like abandoned jobs stuck in running state forever (because if the worker really dies it would have lost the current job ID and we would still execute that if-block).

#4 Updated by mkittler about 1 year ago

  • Status changed from In Progress to Feedback

With my latest adjustments it seems stable. Let's see whether this turns out to be true in the long term.

#5 Updated by mkittler about 1 year ago

  • Status changed from Feedback to Resolved

I actually had not a single test failure anymore in my PRs. So I'm marking it as resolved (for now).

#6 Updated by okurz about 1 year ago

  • Related to action #53594: travis CI tests fail on master in t/33-developer_mode.t "owned by Demo (2 minutes ago, developer has 3 tabs open)' doesn't match '(?^:owned by Demo.*2 tabs open)" added

#7 Updated by okurz about 1 year ago

  • Related to action #55460: flaky test: t/33-developer_mode.t - not ok 308 - Wait for (?^:\"set_pause_on_screen_mismatch\":\"assert_screen\") timed out added

#8 Updated by mkittler about 1 year ago

  • Target version changed from Current Sprint to Done

Also available in: Atom PDF