action #19652

devel:openQA/openQA runs no tests on Tumbleweed

Added by coolo almost 3 years ago. Updated 5 months ago.

Status: Resolved
Start date: 07/06/2017
Priority: Normal
Due date:
Assignee: tinita
% Done: 100%
Category: Concrete Bugs
Target version: Current Sprint
Difficulty:
Duration:

Description

12-needle-edit.t fails and hangs on Tumbleweed. This needs to be debugged.

History

#1 Updated by mkittler almost 3 years ago

  • Status changed from New to In Progress
  • Assignee set to mkittler

#2 Updated by mkittler almost 3 years ago

I can reproduce this locally: the test ./t/ui/12-needle-edit.t hangs.

The test fails with the following error message:

[  181s] [21661:debug] 200 OK (0.019245s, 51.962/s)
[  182s] Uncaught exception from user code:
[  182s]        Error while executing command: clickElement: Server returned error message Server closed connection without sending any data back at /usr/lib/perl5/vendor_perl/5.24.1/Net/HTTP/Methods.pm line 391.
[  182s]         instead of data at /usr/lib/perl5/vendor_perl/5.24.1/Selenium/Remote/Driver.pm line 327.
[  182s]         at /usr/lib/perl5/vendor_perl/5.24.1/Selenium/Remote/Driver.pm line 327.
[  182s]        Selenium::Remote::Driver::catch {...} ("Error while executing command: clickElement: Server returned "...) called at /usr/lib/perl5/vendor_perl/5.24.1/Try/Tiny.pm line 124
[  182s]        Try::Tiny::try(CODE(0x5601b5a76208), Try::Tiny::Catch=REF(0x5601b5a7a478)) called at /usr/lib/perl5/vendor_perl/5.24.1/Selenium/Remote/Driver.pm line 327
[  182s]        Selenium::Remote::Driver::__ANON__(CODE(0x5601b522f828), Test::Selenium::PhantomJS=HASH(0x5601b12f6c20), HASH(0x5601b5a7a4f0)) called at (eval 1324) line 1
[  182s]        Selenium::Remote::Driver::__ANON__(Test::Selenium::PhantomJS=HASH(0x5601b12f6c20), HASH(0x5601b5a7a4f0)) called at (eval 1326) line 2
[  182s]        Selenium::Remote::Driver::_execute_command(Test::Selenium::PhantomJS=HASH(0x5601b12f6c20), HASH(0x5601b5a7a4f0)) called at (eval 1287) line 17
[  182s]        Selenium::Remote::WebElement::_execute_command(Test::Selenium::Remote::WebElement=HASH(0x5601b5a764d8), HASH(0x5601b5a7a4f0)) called at /usr/lib/perl5/vendor_perl/5.24.1/Selenium/Remote/WebElement.pm line 49
[  182s]        Selenium::Remote::WebElement::click(Test::Selenium::Remote::WebElement=HASH(0x5601b5a764d8)) called at ./t/ui/12-needle-edit.t line 228
[  182s]        main::overwrite_needle("test-newneedle") called at ./t/ui/12-needle-edit.t line 312
[  182s] Error while executing command: quit: Server returned error message Can't connect to 127.0.0.1:8910
[  182s] 
[  182s] Connection refused at /usr/lib/perl5/vendor_perl/5.24.1/LWP/Protocol/http.pm line 46.
[  182s]  instead of data at /usr/lib/perl5/vendor_perl/5.24.1/Selenium/Remote/Driver.pm line 327.
[  182s]  at /usr/lib/perl5/vendor_perl/5.24.1/Selenium/Remote/Driver.pm line 327.
[  182s] END failed--call queue aborted.
[  182s] # Tests were run but no plan was declared and done_testing() was not seen.
[  182s] # Looks like your test exited with 111 just after 40.

After that, the build process just hangs forever, leaving a zombie perl process (spawned by prove). There is also another perl process still running:

Thread 1 (Thread 0x7fb9fc147540 (LWP 15123)):
#0  0x00007fb9fb5fcad0 in __read_nocancel () from /lib64/libpthread.so.0
#1  0x0000555ddbe0bf7d in PerlIOUnix_read ()
#2  0x0000555ddbe0f3e8 in PerlIOBuf_fill ()
#3  0x0000555ddbe0dad8 in Perl_PerlIO_fill ()
#4  0x0000555ddbe0f2e0 in PerlIOBase_read ()
#5  0x0000555ddbe112b8 in PerlIO_getc ()
#6  0x0000555ddbdab221 in Perl_sv_gets ()
#7  0x0000555ddbd892ed in Perl_do_readline ()
#8  0x0000555ddbd849c6 in Perl_runops_standard ()
#9  0x0000555ddbd0af87 in perl_run ()
#10 0x0000555ddbce4282 in main ()

Actually, I can also reproduce this by running the test directly with OPENQA_BASEDIR= SELENIUM_CHROME= prove -v t/ui/12-needle-edit.t (not via osc build). It hangs there as well instead of terminating with an error. When the test is started directly via perl, it also fails, but it doesn't hang. When using Chromium instead of PhantomJS, the test does not fail at all.

So there are actually 2 issues:
1. The UI test t/ui/12-needle-edit.t fails.
2. The test just hangs when executed via prove (instead of terminating with an error), causing the entire build process to hang.

#3 Updated by mkittler almost 3 years ago

I still don't understand the source of the issue but it is likely a bug in PhantomJS triggered by $driver->find_element_by_id('save')->click(); in line 228. The button can be located, but the click leads to the error. The error even occurs when simulating the button click via JavaScript ($driver->execute_script(...)).


However, executing the saveNeedle() function directly works. So this would be a workaround for issue 1: https://github.com/os-autoinst/openQA/pull/1370

Issue 2 could be solved by either not using prove or executing the tests with a timeout.
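The timeout idea could be sketched roughly as follows. This is only an illustration, not what the package actually does: run_with_timeout is a hypothetical helper name, the 600 second limit is an assumed value, and it relies on timeout(1) from GNU coreutils being available in the build environment.

```shell
#!/bin/sh
# Hypothetical helper: run a command under a time limit so that a hung
# test cannot stall the whole build.
run_with_timeout() {
    limit="$1"; shift
    # timeout(1) sends TERM after $limit seconds; -k 5 follows up with
    # KILL if the process is still alive 5 seconds later.
    timeout -k 5 "$limit" "$@"
    status=$?
    # timeout exits with 124 when the time limit was reached.
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$status"
}

# Example call (the limit of 600 seconds is an assumption):
# run_with_timeout 600 prove -v t/ui/12-needle-edit.t
```

With a wrapper like this the build would fail with a distinct exit status instead of hanging, although whether it cleans up every leftover child process in this particular case is untested.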

#4 Updated by mkittler almost 3 years ago

  • Status changed from In Progress to Feedback

Hard to tell what exactly is the source of the problem. I could work around the issue by disabling some parts of the test when using phantomjs.

Another solution might be upgrading the version of Qt WebKit that phantomjs uses. It is still using the old Qt WebKit from https://github.com/Vitallium/qtwebkit/commits/phantomjs. Switching to the more recent Qt WebKit-NG (https://github.com/annulen/webkit) might fix the issue. It is supposed to be a drop-in replacement for the old Qt WebKit, but it likely requires some adjustments to work with phantomjs. Not sure whether it is worth the effort, and maybe it doesn't even help.

Why not just use chromedriver when building the package? It would at least work, and phantomjs is still tested via CI.

Note that in the long run we must drop support for phantomjs anyway if the phantomjs devs don't update to a newer WebKit, because support for new web technologies will be missing.

#5 Updated by mkittler almost 3 years ago

I've got a response from a PhantomJS developer. They already use Qt WebKit-NG on the latest master, and the next release (2.5) will have it, too. So PhantomJS will get a recent web engine in the future. This change might also fix the issue in the openQA tests.

Till then we could try to use chromedriver when packaging (as already suggested) or simply omit the problematic test.

#6 Updated by mkittler about 2 years ago

  • Status changed from Feedback to In Progress

@coolo declared phantomjs dead, so I suppose we don't have to wait for the next release and just stick with chromedriver.

However, the UI tests seem to be skipped currently for another reason:

[ 471s] ./t/ui/01-list.t .......................... skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 474s] ./t/ui/02-csrf.t .......................... skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 477s] ./t/ui/02-list-group.t .................... skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 480s] ./t/ui/03-source.t ........................ skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 483s] ./t/ui/04-api_keys.t ...................... skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 486s] ./t/ui/05-auth.t .......................... skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 489s] ./t/ui/06-operator_links.t ................ skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 492s] ./t/ui/07-file.t .......................... skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 495s] ./t/ui/09-admin_creation.t ................ skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 498s] ./t/ui/09-users-list.t .................... skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 501s] ./t/ui/10-tests_overview.t ................ skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 504s] ./t/ui/12-needle-edit.t ................... skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 507s] ./t/ui/13-admin.t ......................... skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 510s] ./t/ui/14-dashboard-parents.t ............. skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 512s] ./t/ui/14-dashboard.t ..................... skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 515s] ./t/ui/15-admin-workers.t ................. skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 518s] ./t/ui/15-comments.t ...................... skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 521s] ./t/ui/16-tests_previous_results.t ........ skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 524s] ./t/ui/17-product-log.t ................... skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 527s] ./t/ui/18-tests-details.t ................. skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 531s] ./t/ui/19-tests-links.t ................... skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 535s] ./t/ui/20-bugzilla-links.t ................ skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 539s] ./t/ui/21-admin-needles.t ................. skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 542s] ./t/ui/22-job_group_order.t ............... skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 545s] ./t/ui/23-audit-log.t ..................... skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test
[ 549s] ./t/ui/24-feature-tour.t .................. skipped: set TEST_PG to e.g. DBI:Pg:dbname=test" to enable this test

(from https://build.opensuse.org/package/live_build_log/devel:openQA/openQA/openSUSE_Tumbleweed/x86_64)

#7 Updated by mkittler about 2 years ago

@coolo Seems like you disabled the code for setting the environment variable intentionally:

%if %{with tests}
#make test
rm -rf %{buildroot}/DB
#./t/test_postgresql %{buildroot}/DB
#export TEST_PG="DBI:Pg:dbname=openqa_test;host=%{buildroot}/DB"
OBS_RUN=1 prove -r -j2 -v
#pg_ctl -D %{buildroot}/DB stop
%endif

Are you aware that this prevents some of the tests from running?
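One way to make such a silent skip visible would be a small guard before the test run. The following is only a sketch: require_test_pg is a hypothetical helper, not something from the spec file, and it merely checks that the TEST_PG variable from the snippet above is non-empty.

```shell
#!/bin/sh
# Hypothetical guard: complain loudly when TEST_PG is missing, because
# the database-backed tests are otherwise skipped without failing.
require_test_pg() {
    if [ -z "${TEST_PG:-}" ]; then
        echo "TEST_PG is not set; database-backed tests would be skipped" >&2
        return 1
    fi
    echo "using TEST_PG=$TEST_PG"
}

# Example use before running the tests:
# require_test_pg && OBS_RUN=1 prove -r -j2 -v
```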

#8 Updated by coolo about 2 years ago

random issues that need debugging

#9 Updated by coolo about 2 years ago

I soft-enabled the tests now, but quite a bunch of them are still failing.

e.g.
```
[ 19s] + export LC_ALL=en_US.UTF-8
[ 19s] + LC_ALL=en_US.UTF-8
[ 19s] + MOJO_LOG_LEVEL=debug
[ 19s] + OBS_RUN=1
[ 19s] + prove -v t/api/09-comments.t
[ 23s]
[ 23s] # Failed test 'comment can be created'
[ 23s] # at t/api/09-comments.t line 79.
[ 23s] # got: '500'
[ 23s] # expected: '200'
[ 23s] Use of uninitialized value $comment_id in concatenation (.) or string at
[ 23s] t/api/09-comments.t line 43 (#1)
[ 23s] An undefined value was used as if it were already
[ 23s] defined. It was interpreted as a "" or a 0, but maybe it was a mistake.
[ 23s] To suppress this warning assign a defined value to your variables.
[ 23s]

[ 23s] To help you figure out what was undefined, perl will try to tell you
[ 23s] the name of the variable (if any) that was undefined. In some cases
[ 23s] it cannot do this, so it also tells you what operation you used the
[ 23s] undefined value in. Note, however, that perl optimizes your program
[ 23s] and the operation displayed in the warning may not necessarily appear
[ 23s] literally in your program. For example, "that $foo" is usually
[ 23s] optimized into "that " . $foo, and the warning will refer to the
[ 23s] concatenation (.) operator, even though there is no . in
[ 23s] your program.
[ 23s]

[ 23s] Use of uninitialized value $comment_id in concatenation (.) or string at t/api/09-comments.t line 43.
[ 23s] # Looks like you failed 1 test of 3.
[ 23s]
[ 23s] # Failed test 'job comments'
[ 23s] # at t/api/09-comments.t line 153.
[ 23s] Not a HASH reference at t/api/09-comments.t line 44 (#2)
[ 23s] Perl was trying to evaluate a reference to a hash value, but found a
[ 23s] reference to something else instead. You can use the ref() function to
[ 23s] find out what kind of ref it really was. See perlref.
[ 23s]

[ 23s] Uncaught exception from user code:
[ 23s] Not a HASH reference at t/api/09-comments.t line 44.
[ 23s] Test::Builder::subtest(Test::Builder=HASH(0x55b23dbf29f8), "job comments", CODE(0x55b2448e4a38)) called at /usr/lib/perl5/5.26.1/Test/More.pm line 807
[ 23s] Test::More::subtest("job comments", CODE(0x55b2448e4a38)) called at t/api/09-comments.t line 153
[ 23s]
[ 23s] # Failed test 'no (unexpected) warnings (via END block)'
[ 23s] # at /usr/lib/perl5/5.26.1/Test/Builder.pm line 135.
[ 23s] # Tests were run but no plan was declared and done_testing() was not seen.
[ 23s] # Looks like your test exited with 25 just after 2.
[ 23s] t/api/09-comments.t ..
[ 23s] # Subtest: job comments
[ 23s] [debug] POST "/api/v1/jobs/99981/comments"
[ 23s] [debug] Routing to controller "OpenQA::WebAPI::Controller::API::V1" and action "auth"
[ 23s] [debug] API key from client: PERCIVALKEY02
[ 23s] [debug] Key is for user 'percival'
[ 23s] [debug] API auth by user: percival, operator: 1
[ 23s] [debug] Routing to controller "OpenQA::WebAPI::Controller::API::V1::Comment" and action "create"
[ 23s] [error] Wide character in subroutine entry at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1891.
[ 23s] [debug] 500 Internal Server Error (0.016989s, 58.862/s)
[ 23s] ok 1 - POST /api/v1/jobs/99981/comments
[ 23s] not ok 2 - comment can be created
[ 23s] # Subtest: get comment
[ 23s] [debug] GET "/api/v1/jobs/99981/comments/"
[ 23s] [debug] Routing to controller "OpenQA::WebAPI::Controller::API::V1::Comment" and action "list"
[ 23s] [debug] 200 OK (0.005509s, 181.521/s)
[ 23s] ok 1 - GET /api/v1/jobs/99981/comments/
[ 23s] 1..1
[ 23s] ok 3 - get comment
[ 23s] 1..3
[ 23s] not ok 1 - job comments
[ 23s] not ok 2 - no (unexpected) warnings (via END block)
[ 23s] Dubious, test returned 25 (wstat 6400, 0x1900)
[ 23s] Failed 2/2 subtests
```

#10 Updated by AdamWill about 2 years ago

FWIW, coolo disabled these tests in the OBS spec:

rm -f ./t/24-worker.t ./t/api/09-comments.t ./t/ui/07-file.t ./t/ui/13-admin.t ./t/ui/15-comments.t ./t/ui/18-tests-details.t

I just did a build of current git in the Fedora package build environment (koji). 24-worker.t and ui/07-file.t failed there too. api/09-comments.t passed. The ui tests are all skipped in our build as we don't have selenium available.

#11 Updated by coolo about 2 years ago

I reverted that removal again, and these tests currently fail:

[ 1206s] Test Summary Report
[ 1206s] -------------------
[ 1206s] ./t/05-scheduler-dependencies.t (Wstat: 256 Tests: 199 Failed: 1)
[ 1206s] Failed test: 176
[ 1206s] Non-zero exit status: 1
[ 1206s] ./t/ui/07-file.t (Wstat: 1024 Tests: 60 Failed: 4)
[ 1206s] Failed tests: 28-29, 57-58
[ 1206s] Non-zero exit status: 4
[ 1206s] ./t/ui/13-admin.t (Wstat: 256 Tests: 15 Failed: 1)
[ 1206s] Failed test: 14
[ 1206s] Non-zero exit status: 1

There is a problem with one DVD image that isn't found in the assets.
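For reading the Wstat values in the summary above (and the "wstat 6400, 0x1900" in the earlier 09-comments.t log): this is the raw wait status, where the exit code sits in the high byte and a terminating signal, if any, in the low 7 bits. A small sketch for decoding it, with decode_wstat being a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: split a raw wait status into exit code and signal.
decode_wstat() {
    wstat="$1"
    echo "exit=$(( wstat >> 8 )) signal=$(( wstat & 127 ))"
}

decode_wstat 256    # exit=1 signal=0
decode_wstat 1024   # exit=4 signal=0
decode_wstat 6400   # exit=25 signal=0
```

The results agree with the "Non-zero exit status" lines and with "test returned 25" in the logs.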

#12 Updated by coolo about 2 years ago

[  153s] #   Failed test 'jobA2 has jobB2, jobC2 and jobD2 as children'
[  153s] #   at ./t/05-scheduler-dependencies.t line 716.
[  153s] #     Structures begin differing at:
[  153s] #          $got->[0] = '100030'
[  153s] #     $expected->[0] = '100032'
[  155s] # Looks like you failed 1 test of 199.
[  155s] ./t/05-scheduler-dependencies.t ........... 

[  740s] #   Failed test '200 OK'
[  740s] #   at ./t/ui/07-file.t line 66.
[  740s] #          got: '404'
[  740s] #     expected: '200'
[  740s] 
[  740s] #   Failed test 'Content-Disposition: attatchment; filename=openSUSE-13.1-DVD-i586-Build0091-Media.iso;'
[  740s] #   at ./t/ui/07-file.t line 66.
[  740s] #          got: undef
[  740s] #     expected: 'attatchment; filename=openSUSE-13.1-DVD-i586-Build0091-Media.iso;'
[  756s] # Premature connection close
[  756s] 
[  756s] #   Failed test 'GET /assets/iso/openSUSE-13.1-DVD-i586-Build0091-Media.iso'
[  756s] #   at ./t/ui/07-file.t line 100.
[  756s] 
[  756s] #   Failed test '200 OK'
[  756s] #   at ./t/ui/07-file.t line 100.
[  756s] #          got: undef
[  756s] #     expected: '200'

[  867s]     #   Failed test 'asset with unknown last use and size'
[  867s]     #   at ./t/ui/13-admin.t line 515.
[  867s]     #     Structures begin differing at:
[  867s]     #          $got->[0] = Does not exist
[  867s]     #     $expected->[0] = 'iso/openSUSE-13.1-DVD-i586-Build0091-Media.iso'
[  868s] 
[  868s]     #   Failed test 'groups of "assets by group"'
[  868s]     #   at ./t/ui/13-admin.t line 531.
[  868s]     #     Structures begin differing at:
[  868s]     #          $got->[0] = 'Untracked
[  868s]     #     40MiB / 0 GiB'
[  868s]     #     $expected->[0] = 'opensuse
[  868s]     #     16 Byte / 100 GiB'
[  868s] 
[  868s]     #   Failed test 'assets of "assets by group"'
[  868s]     #   at ./t/ui/13-admin.t line 537.
[  868s]     #     Structures begin differing at:
[  868s]     #          $got->[2] = 'hdd/fixed/openSUSE-13.1-x86_64.hda
[  868s]     #     4 Byte'
[  868s]     #     $expected->[2] = 'iso/openSUSE-13.1-DVD-i586-Build0091-Media.iso
[  868s]     #     4 Byte'
[  873s]     # Looks like you failed 3 tests of 13.
[  873s] 
[  873s] #   Failed test 'asset list'
[  873s] #   at ./t/ui/13-admin.t line 557.

#13 Updated by coolo about 2 years ago

  • Status changed from In Progress to New
  • Assignee deleted (mkittler)
  • Target version changed from Milestone 8 to Ready

The priority is actually a lowish normal.

#14 Updated by okurz almost 2 years ago

With https://build.opensuse.org/package/rdiff/devel:openQA/openQA?linkrev=base&rev=3530 a package self-test has been introduced.

The %check section in the spec file runs the tests but ignores their result with || true, and some tests fail, so I guess this ticket is still valid.
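To illustrate why || true hides failures: the exit status of the whole list becomes 0 regardless of what the test run returned. A minimal sketch, where run_checks is a hypothetical stand-in for the failing test run and not the actual command from the spec file:

```shell
#!/bin/sh
# Hypothetical stand-in for a test run that fails with status 3.
run_checks() { return 3; }

# With "|| true" the failure is swallowed and the build continues.
run_checks || true
echo "masked status: $?"      # prints "masked status: 0"

# Recording the status instead keeps the failure visible.
rc=0
run_checks || rc=$?
echo "recorded status: $rc"   # prints "recorded status: 3"
```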

#15 Updated by tinita 5 months ago

  • Status changed from New to In Progress
  • Assignee set to tinita

#16 Updated by tinita 5 months ago

  • Target version changed from Ready to Current Sprint

#17 Updated by cdywan 5 months ago

  • Subject changed from devel:openQA/openQA runs no tests on TW to devel:openQA/openQA runs no tests on Tumbleweed

#18 Updated by tinita 5 months ago

PR https://github.com/os-autoinst/openQA/pull/2431

I'm running the OBS build a couple of times now to see if there are more flaky tests I need to exclude.

#19 Updated by tinita 5 months ago

PR was merged

#20 Updated by tinita 5 months ago

  • Status changed from In Progress to Feedback
  • % Done changed from 0 to 100

#21 Updated by tinita 5 months ago

  • Status changed from Feedback to Resolved
