Project

General

Profile

Actions

action #46556

closed

test incompletes when qemu crashes

Added by okurz almost 6 years ago. Updated about 5 years ago.

Status:
Resolved
Priority:
Low
Assignee:
Category:
Regressions/Crashes
Target version:
-
Start date:
2019-01-23
Due date:
% Done:

0%

Estimated time:

Description

Observation

openQA test in scenario opensuse-Tumbleweed-DVD-x86_64-update_Leap_42.3_gnome@okurz__os-autoinst-distri-opensuse_fix_vlc_027@64bit incompletes in https://openqa.opensuse.org/tests/836085 with

[2019-01-23T15:03:33.097 CET] [debug] Backend process died, backend errors are reported below in the following lines:
Can't syswrite(IO::Socket::UNIX=GLOB(0x55f27f04a6a8), <BUFFER>): Broken pipe at /usr/lib/os-autoinst/backend/qemu.pm line 956

[2019-01-23T15:03:33.098 CET] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
last frame
[2019-01-23T15:03:33.129 CET] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
[2019-01-23T15:03:33.131 CET] [debug] QEMU: Formatting '/var/lib/openqa/pool/3/raid/hd0-overlay6', fmt=qcow2 size=42949672960 backing_file=/var/lib/openqa/pool/3/raid/hd0-overlay5 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16
[2019-01-23T15:03:33.131 CET] [debug] QEMU: Formatting '/var/lib/openqa/pool/3/raid/cd0-overlay6', fmt=qcow2 size=4601151488 backing_file=/var/lib/openqa/pool/3/raid/cd0-overlay5 backing_fmt=qcow2 cluster_size=65536 lazy_refcounts=off refcount_bits=16
[2019-01-23T15:03:33.132 CET] [debug] sending magic and exit
[2019-01-23T15:03:33.132 CET] [debug] received magic close
[2019-01-23T15:03:33.133 CET] [debug] THERE IS NOTHING TO READ 15 4 3
[2019-01-23T15:03:33.153 CET] [debug] backend process exited: 0
[2019-01-23T15:03:33.157 CET] [debug] commands process exited: 0
[2019-01-23T15:03:34.157 CET] [debug] sysread failed: 
[2019-01-23T15:03:34.158 CET] [debug] ||| starting zypper_lr tests/console/zypper_lr.pm
syswrite failed Broken pipe at /usr/lib/os-autoinst/myjsonrpc.pm line 40.
    myjsonrpc::send_json(GLOB(0x55f27b471e58), HASH(0x55f27b4aebc8)) called at /usr/lib/os-autoinst/autotest.pm line 313
    autotest::query_isotovideo("set_current_test", HASH(0x55f28050af58)) called at /usr/lib/os-autoinst/autotest.pm line 181
    autotest::set_current_test(zypper_lr=HASH(0x55f27ca7eb48)) called at /usr/lib/os-autoinst/basetest.pm line 272
    basetest::start(zypper_lr=HASH(0x55f27ca7eb48)) called at /usr/lib/os-autoinst/autotest.pm line 350
    autotest::runalltests() called at /usr/lib/os-autoinst/autotest.pm line 214
    eval {...} called at /usr/lib/os-autoinst/autotest.pm line 214
    autotest::run_all() called at /usr/lib/os-autoinst/autotest.pm line 267
    autotest::__ANON__(Mojo::IOLoop::ReadWriteProcess=HASH(0x55f27cf460f0)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 325
    eval {...} called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 325
    Mojo::IOLoop::ReadWriteProcess::_fork(Mojo::IOLoop::ReadWriteProcess=HASH(0x55f27cf460f0), CODE(0x55f27cf27890)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 476
    Mojo::IOLoop::ReadWriteProcess::start(Mojo::IOLoop::ReadWriteProcess=HASH(0x55f27cf460f0)) called at /usr/lib/os-autoinst/autotest.pm line 268
    autotest::start_process() called at /usr/bin/isotovideo line 251

Reproducible

Seen once so far

Actions #1

Updated by coolo almost 6 years ago

  • Subject changed from test incompletes with "Can't syswrite(IO::Socket::UNIX=GLOB(0x55f27f04a6a8), <BUFFER>): Broken pipe at /usr/lib/os-autoinst/backend/qemu.pm line 956" to test incompletes when qemu crashes

Well, what do you expect us to do?

Jan 23 15:03:33 openqaworker4 kernel: qemu-system-x86[8013]: segfault at 55641f6d9418 ip 0000555acad96d2f sp 00007f6220ff8660 error 4 in qemu-system-x86_64[555aca7de000+8fa000]

Actions #2

Updated by okurz almost 6 years ago

Well, for example I assume we can inform the user about exactly this that qemu segfaults? Did you report this as a bug or is it reported already?

Actions #3

Updated by okurz about 5 years ago

Something similar:

https://openqa.opensuse.org/tests/1082087/file/autoinst-log.txt is incomplete with

[2019-11-11T09:38:22.715 CET] [debug] <<< testapi::assert_screen(mustmatch=[
  'displaymanager',
  'emergency-shell',
  'emergency-mode'
], timeout=200)
[2019-11-11T09:38:23.889 CET] [debug] no match: 199.3s, best candidate: displaymanager-gdm-20170703 (0.17)
[2019-11-11T09:38:24.400 CET] [debug] no change: 198.3s
[2019-11-11T09:38:25.401 CET] [debug] no change: 197.3s
[2019-11-11T09:38:26.673 CET] [debug] no match: 196.3s, best candidate: displaymanager-gdm-20170703 (0.17)
[2019-11-11T09:38:30.970 CET] [debug] WARNING: check_asserted_screen took 3.57 seconds for 27 candidate needles - make your needles more specific
[2019-11-11T09:38:30.970 CET] [debug] no match: 195.3s, best candidate: displaymanager-gdm-20170703 (0.17)
[2019-11-11T09:38:30.977 CET] [debug] no change: 191.7s
[2019-11-11T09:38:31.220 CET] [debug] considering VNC stalled, no update for 4.83 seconds
[2019-11-11T09:38:31.475 CET] [debug] pointer type 1 0 1024 768 -257
[2019-11-11T09:38:31.475 CET] [debug] led state 0 1 1 -261
[2019-11-11T09:38:31.971 CET] [debug] no change: 190.7s
[2019-11-11T09:38:32.973 CET] [debug] no change: 189.7s
*** Error in `/usr/bin/isotovideo: backen': corrupted size vs. prev_size: 0x000055b1ed073f30 ***
[2019-11-11T09:38:33.998 CET] [debug] backend process exited: 0
[2019-11-11T09:38:33.999 CET] [debug] sysread failed: 
[2019-11-11T09:38:33.999 CET] [debug] killing command server 17315 because test execution ended
[2019-11-11T09:38:33.999 CET] [debug] isotovideo: informing websocket clients before stopping command server: http://127.0.0.1:20053/1j2iZBIhATnNg11D/broadcast

and another incomplete also in cryptlvm with very similar symptoms but no "corrupted size vs. prev_size":
https://openqa.opensuse.org/tests/1082003/file/autoinst-log.txt

[2019-11-11T06:39:32.416 CET] [debug] <<< testapi::assert_screen(mustmatch=[
  'displaymanager',
  'emergency-shell',
  'emergency-mode'
], timeout=200)
[2019-11-11T06:39:33.598 CET] [debug] WARNING: check_asserted_screen took 0.51 seconds for 27 candidate needles - make your needles more specific
[2019-11-11T06:39:33.598 CET] [debug] no match: 199.3s, best candidate: displaymanager-gdm-20170703 (0.17)
[2019-11-11T06:39:34.089 CET] [debug] no change: 198.3s
[2019-11-11T06:39:35.090 CET] [debug] no change: 197.3s
[2019-11-11T06:39:36.091 CET] [debug] no change: 196.3s
[2019-11-11T06:39:40.399 CET] [debug] WARNING: check_asserted_screen took 3.31 seconds for 27 candidate needles - make your needles more specific
[2019-11-11T06:39:40.399 CET] [debug] no match: 195.3s, best candidate: displaymanager-gdm-20170703 (0.17)
[2019-11-11T06:39:40.407 CET] [debug] no change: 192.0s
[2019-11-11T06:39:40.650 CET] [debug] considering VNC stalled, no update for 4.55 seconds
[2019-11-11T06:39:40.904 CET] [debug] pointer type 1 0 1024 768 -257
[2019-11-11T06:39:40.905 CET] [debug] led state 0 1 1 -261
[2019-11-11T06:39:41.401 CET] [debug] no change: 191.0s
[2019-11-11T06:39:42.402 CET] [debug] no change: 190.0s
[2019-11-11T06:39:43.427 CET] [debug] backend process exited: 0
[2019-11-11T06:39:43.427 CET] [debug] sysread failed: 
[2019-11-11T06:39:43.428 CET] [debug] killing command server 32296 because test execution ended
[2019-11-11T06:39:43.428 CET] [debug] isotovideo: informing websocket clients before stopping command server: http://127.0.0.1:20123/hNN9yfN7DiY_0Uzu/broadcast
Actions #4

Updated by okurz about 5 years ago

  • Status changed from New to Resolved
  • Assignee set to okurz

Maybe https://github.com/os-autoinst/os-autoinst/pull/1269 is enough to at least get rid of the confusing stack trac of myjsonrpc when the error is by qemu itself. With my work in https://github.com/os-autoinst/scripts/blob/master/openqa-label-known-issues I will try to find any potential future appearance and see what we can do to make the real errors more prominent.

Actions

Also available in: Atom PDF