action #39515

videoencoder crashed on aarch64

Added by szarate over 1 year ago. Updated over 1 year ago.

Status: Resolved
Start date: 09/08/2018
Priority: High
Due date:
Assignee: szarate
% Done: 0%
Category: Concrete Bugs
Target version: Done
Difficulty:
Duration:

Description

This job on aarch64 seems to have crashed: https://openqa.suse.de/tests/1913199/file/autoinst-log.txt

[2018-08-09T09:46:36.0891 UTC] [debug] WARNING: check_asserted_screen took 1.07 seconds - make your needles more specific
[2018-08-09T09:46:36.0892 UTC] [debug] DEBUG_IO: 
NAME                                               TIME        CUMULATIVE      PERCENTAGE
 Searching for needles                              0.000       0.000           0.011%
 **++ search__: get image                           0.000       0.000           0.013%
 **++ tinycv::search_needle 265x90 + 1024 @ 271x257 0.373       0.373           34.849%
 ** search_: rebootnow-20160504                     0.001       0.374           0.109%
 **++ search__: get image                           0.000       0.374           0.007%
 **++ tinycv::search_needle 265x90 + 1024 @ 268x232 0.361       0.735           33.734%
 ** search_: rebootnow-20150409                     0.001       0.737           0.125%
 **++ search__: get image                           0.000       0.737           0.013%
 **++ tinycv::search_needle 265x90 + 1024 @ 245x237 0.332       1.069           31.044%
 ** search_: rebootnow-20131217                     0.001       1.070           0.079%
 _stop_                                             0.000       1.070           0.015%

[2018-08-09T09:46:36.0907 UTC] [debug] no match: 2535.1s
[2018-08-09T09:46:37.0083 UTC] [debug] MATCH(rebootnow-20160504:0.52)
[2018-08-09T09:46:37.0127 UTC] [debug] MATCH(rebootnow-20150409:0.00)
[2018-08-09T09:46:37.0167 UTC] [debug] MATCH(rebootnow-20131217:0.00)
[2018-08-09T09:46:37.0168 UTC] [debug] no match: 2533.9s
[2018-08-09T09:46:38.0021 UTC] [debug] MATCH(rebootnow-20160504:0.52)
[2018-08-09T09:46:38.0065 UTC] [debug] MATCH(rebootnow-20150409:0.00)
[2018-08-09T09:46:38.0106 UTC] [debug] MATCH(rebootnow-20131217:0.00)
[2018-08-09T09:46:38.0108 UTC] [debug] no match: 2532.9s
[2018-08-09T09:46:38.0923 UTC] [debug] no change: 2532.0s
[2018-08-09T09:46:40.0316 UTC] [debug] WARNING: enqueue_screenshot took 1.38 seconds
[2018-08-09T09:46:40.0317 UTC] [debug] DEBUG_IO: 
NAME                       TIME        CUMULATIVE      PERCENTAGE
 scaling                    1.355       1.355           97.963%
 similarity                 0.013       1.367           0.926%
 convert ppm data           0.011       1.379           0.804%
 _stop_                     0.004       1.383           0.307%

[2018-08-09T09:46:40.0340 UTC] [debug] sending magic and exit
[2018-08-09T09:46:40.0341 UTC] [debug] received magic close
[2018-08-09T09:46:42.0162 UTC] [debug] backend process exited: 0
Unexpected end of data 0
[2018-08-09T09:46:42.0551 UTC] [debug] Driver backend collected unknown process with pid 91046 and exit status: 1
[2018-08-09T09:46:43.0558 UTC] [error] can_read received kill signal at /usr/lib/os-autoinst/myjsonrpc.pm line 89.

[2018-08-09T09:46:43.0620 UTC] [debug] commands process exited: 0
[2018-08-09T09:46:44.0622 UTC] [debug] sysread failed: 
[2018-08-09T09:46:44.0628 UTC] [debug] syswrite failed Broken pipe at /usr/lib/os-autoinst/myjsonrpc.pm line 38.
    myjsonrpc::send_json('GLOB(0x14cfd010)', 'HASH(0x11fdbb00)') called at /usr/lib/os-autoinst/autotest.pm line 282
    autotest::query_isotovideo('backend_last_screenshot_data') called at /usr/lib/os-autoinst/basetest.pm line 498
    basetest::_result_add_screenshot('await_install=HASH(0x14bb5b50)', 'HASH(0x14bb6138)') called at /usr/lib/os-autoinst/basetest.pm line 350
    basetest::runtest('await_install=HASH(0x14bb5b50)') called at /usr/lib/os-autoinst/autotest.pm line 328
    eval {...} called at /usr/lib/os-autoinst/autotest.pm line 327
    autotest::runalltests() called at /usr/lib/os-autoinst/autotest.pm line 183
    eval {...} called at /usr/lib/os-autoinst/autotest.pm line 183
    autotest::run_all() called at /usr/lib/os-autoinst/autotest.pm line 236
    autotest::__ANON__('Mojo::IOLoop::ReadWriteProcess=HASH(0x14d3d600)') called at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/IOLoop/ReadWriteProcess.pm line 309
    eval {...} called at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/IOLoop/ReadWriteProcess.pm line 309
    Mojo::IOLoop::ReadWriteProcess::_fork('Mojo::IOLoop::ReadWriteProcess=HASH(0x14d3d600)', 'CODE(0x14e49aa8)') called at /usr/lib/perl5/vendor_perl/5.18.2/Mojo/IOLoop/ReadWriteProcess.pm line 445
    Mojo::IOLoop::ReadWriteProcess::start('Mojo::IOLoop::ReadWriteProcess=HASH(0x14d3d600)') called at /usr/lib/os-autoinst/autotest.pm line 237
    autotest::start_process() called at /usr/bin/isotovideo line 265

[2018-08-09T09:46:44.0653 UTC] [debug] [autotest] process exited: 0
[2018-08-09T09:46:45.0655 UTC] [debug] isotovideo failed
[2018-08-09T09:46:45.0657 UTC] [debug] killing backend process 91044
[2018-08-09T09:46:45.0658 UTC] [debug] done with backend process
90728: EXIT 1

kernel (528 KB) szarate, 09/08/2018 01:51 pm

Screenshot-2018-8-9 Grafana - Worker Load.png (257 KB) szarate, 09/08/2018 02:21 pm


History

#1 Updated by szarate over 1 year ago

So the machine's OOM killer started to have fun.
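For anyone wanting to confirm this kind of thing from a kernel log, here is a minimal sketch that scans log text for OOM-killer reports. It assumes the usual kernel wording ("Out of memory: Kill process ..." on older kernels, "Out of memory: Killed process ..." on newer ones). The helper name `find_oom_kills` and the sample lines are hypothetical and are not taken from the attached kernel log.

```python
import re

# Matches both the older "Kill process" and the newer "Killed process" wording.
OOM_RE = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(log_text):
    """Return (pid, command) pairs for every OOM kill reported in log_text."""
    return [(int(pid), comm) for pid, comm in OOM_RE.findall(log_text)]


# Illustrative lines only; NOT copied from the kernel log attached to this ticket.
sample = """\
[12345.678] Out of memory: Kill process 4242 (example-proc) score 900 or sacrifice child
[12345.679] Killed process 4242 (example-proc) total-vm:1000kB, anon-rss:500kB, file-rss:0kB
"""

if __name__ == "__main__":
    for pid, comm in find_oom_kills(sample):
        print(f"OOM killer hit pid {pid} ({comm})")
```

Feeding it the output of `dmesg` or the contents of a saved kernel log should list the victims, provided the log uses one of the two wordings above.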

#2 Updated by EDiGiacinto over 1 year ago

It also seems swap was exhausted. @szarate pointed out that there are currently 40 workers on that instance, and I agree that this is possibly too many and is exhausting resources.

#3 Updated by szarate over 1 year ago

This explains a lot :) http://10.86.0.11:3000/dashboard/snapshot/0MKCPaH1jkcSeXWLc5jXHPMrQMQZjuKy?orgId=0

The worker host also has 41 workers enabled? Changing that too.

#4 Updated by szarate over 1 year ago

It should be running only 30 workers (as per workers.ini file)...
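A rough sketch of how such a mismatch could be spotted, assuming the intended instances each have a numbered section (`[1]`, `[2]`, ...) in workers.ini. That layout is an assumption about this host, not something the ticket confirms. The helper names and the shortened sample config are hypothetical, and the list of enabled instances would have to come from wherever the worker services are managed.

```python
import configparser


def configured_instances(ini_text):
    """Return the sorted instance numbers that have a numeric section, e.g. [1], [2]."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return sorted(int(name) for name in parser.sections() if name.isdigit())


def unexpected_instances(ini_text, enabled):
    """Return enabled instance numbers that have no matching section in the config."""
    configured = set(configured_instances(ini_text))
    return sorted(set(enabled) - configured)


# Hypothetical, shortened config: three instances instead of the 30 mentioned above.
sample_ini = """\
[global]
HOST = http://localhost

[1]
WORKER_CLASS = qemu_aarch64

[2]
WORKER_CLASS = qemu_aarch64

[3]
WORKER_CLASS = qemu_aarch64
"""

if __name__ == "__main__":
    # Instance 4 is enabled but not described in the config.
    print(unexpected_instances(sample_ini, enabled=[1, 2, 3, 4]))
```

Any instance number it reports is running without being listed in the file, which is the situation described here (more workers enabled than the 30 in workers.ini).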

#5 Updated by szarate over 1 year ago

  • Target version changed from Current Sprint to Done
