action #91163

Many jobs on OSD and o3 are incomplete because of auto_review:"backend died: missing input at /usr/lib/os-autoinst/bmwqemu.pm line 202"

Added by Xiaojing_liu 4 months ago. Updated 4 months ago.

Status: Resolved
Priority: High
Assignee:
Category: Concrete Bugs
Target version:
Start date: 2021-04-15
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Observation

Many jobs on OSD and o3 have been incomplete since the last deployment. The reported reason is: backend died: missing input at /usr/lib/os-autoinst/bmwqemu.pm line 202.
The job log shows:

[2021-04-15T02:47:43.747 CEST] [debug] led state 0 1 1 -261
Use of uninitialized value $message in scalar chomp at /usr/lib/os-autoinst/backend/qemu.pm line 155.
Use of uninitialized value $rt in numeric eq (==) at /usr/lib/os-autoinst/backend/qemu.pm line 156.
Use of uninitialized value $message in scalar chomp at /usr/lib/os-autoinst/backend/qemu.pm line 155.
Use of uninitialized value $rt in numeric eq (==) at /usr/lib/os-autoinst/backend/qemu.pm line 156.
[2021-04-15T02:47:43.873 CEST] [debug] Open vSwitch networking status:
[2021-04-15T02:47:43.874 CEST] [info] ::: backend::baseclass::die_handler: Backend process died, backend errors are reported below in the following lines:
missing input at /usr/lib/os-autoinst/bmwqemu.pm line 202.
    bmwqemu::diag(undef) called at /usr/lib/os-autoinst/backend/qemu.pm line 1068
    backend::qemu::start_qemu(backend::qemu=HASH(0x55bee5a31e88)) called at /usr/lib/os-autoinst/backend/qemu.pm line 125
    backend::qemu::do_start_vm(backend::qemu=HASH(0x55bee5a31e88)) called at /usr/lib/os-autoinst/backend/baseclass.pm line 430
    backend::baseclass::start_vm(backend::qemu=HASH(0x55bee5a31e88), undef) called at /usr/lib/os-autoinst/backend/baseclass.pm line 89
    backend::baseclass::handle_command(backend::qemu=HASH(0x55bee5a31e88), HASH(0x55bee499fac0)) called at /usr/lib/os-autoinst/backend/baseclass.pm line 616
    backend::baseclass::check_socket(backend::qemu=HASH(0x55bee5a31e88), IO::Handle=GLOB(0x55bee49871b0)) called at /usr/lib/os-autoinst/backend/qemu.pm line 1183
    backend::qemu::check_socket(backend::qemu=HASH(0x55bee5a31e88), IO::Handle=GLOB(0x55bee49871b0), 0) called at /usr/lib/os-autoinst/backend/baseclass.pm line 273
    eval {...} called at /usr/lib/os-autoinst/backend/baseclass.pm line 190
    backend::baseclass::run_capture_loop(backend::qemu=HASH(0x55bee5a31e88)) called at /usr/lib/os-autoinst/backend/baseclass.pm line 146
    backend::baseclass::run(backend::qemu=HASH(0x55bee5a31e88), 14, 17) called at /usr/lib/os-autoinst/backend/driver.pm line 86
    backend::driver::__ANON__(Mojo::IOLoop::ReadWriteProcess=HASH(0x55bee0316460)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
    eval {...} called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
    Mojo::IOLoop::ReadWriteProcess::_fork(Mojo::IOLoop::ReadWriteProcess=HASH(0x55bee0316460), CODE(0x55bee4157940)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 477
    Mojo::IOLoop::ReadWriteProcess::start(Mojo::IOLoop::ReadWriteProcess=HASH(0x55bee0316460)) called at /usr/lib/os-autoinst/backend/driver.pm line 87
    backend::driver::start(backend::driver=HASH(0x55bee0316508)) called at /usr/lib/os-autoinst/backend/driver.pm line 52
    backend::driver::new("backend::driver", "qemu") called at /usr/bin/isotovideo line 225
    main::init_backend() called at /usr/bin/isotovideo line 276

[2021-04-15T02:47:43.874 CEST] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
Use of uninitialized value $message in scalar chomp at /usr/lib/os-autoinst/backend/qemu.pm line 155.
Use of uninitialized value $rt in numeric eq (==) at /usr/lib/os-autoinst/backend/qemu.pm line 156.
[2021-04-15T02:47:44.922 CEST] [debug] Passing remaining frames to the video encoder
[2021-04-15T02:47:45.037 CEST] [debug] Waiting for video encoder to finalize the video

This might be a regression caused by https://github.com/os-autoinst/os-autoinst/pull/1641, but I am not sure.

Examples:
https://openqa.suse.de/tests/5823910#dependencies
https://openqa.opensuse.org/tests/1699452#

History

#1 Updated by Xiaojing_liu 4 months ago

  • Description updated (diff)

#2 Updated by AdamWill 4 months ago

Oh, damn, yes, it probably is. This is Perl's stupid "return the result of the last expression by default" thing. Before the PR, the last line of the function was $self->{dbus_object}->$fn(@args);, so I think we were returning the result of that. Now the last line is the dbus disconnect call, so the function returns the result of that call instead.
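The implicit-return behaviour described above can be reproduced in isolation. The following is only a minimal sketch, not the actual os-autoinst code: fake_dbus_call, fake_disconnect and both call_then_cleanup_* subs are hypothetical stand-ins, and the ($rt, $message) return shape is an assumption based on the variable names in the warnings from the log.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the real D-Bus call and disconnect.
my $disconnected = 0;
sub fake_dbus_call  { return (0, "ok: @_") }        # assumed shape: ($rt, $message)
sub fake_disconnect { $disconnected++; return }     # returns nothing

# Buggy shape: the cleanup call is the last statement, so its (empty)
# result becomes the implicit return value of the sub.
sub call_then_cleanup_buggy {
    my @args   = @_;
    my @result = fake_dbus_call(@args);
    fake_disconnect();
}

# Fixed shape: keep the result and return it explicitly after cleanup.
sub call_then_cleanup_fixed {
    my @args   = @_;
    my @result = fake_dbus_call(@args);
    fake_disconnect();
    return @result;
}

my ($rt_bad, $message_bad) = call_then_cleanup_buggy('show');
my ($rt_ok,  $message_ok)  = call_then_cleanup_fixed('show');

printf "buggy: rt=%s message=%s\n", $rt_bad // 'undef', $message_bad // 'undef';
printf "fixed: rt=%s message=%s\n", $rt_ok  // 'undef', $message_ok  // 'undef';
```

With the buggy variant both values come back undefined, which matches the "Use of uninitialized value $message" and "$rt" warnings in the log and would explain bmwqemu::diag(undef) in the backtrace.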

I'll send a PR with what ought to be the fix, and see if it's possible to make the tests test it :(

Note: as a quick workaround, unset OVS_DEBUG for now. This crash probably only happens if it is set.

#3 Updated by AdamWill 4 months ago

https://github.com/os-autoinst/os-autoinst/pull/1644 sent. I tested it, was able to recreate the bug, and confirmed that the PR fixes it. Very sorry for the trouble.

#4 Updated by okurz 4 months ago

  • Subject changed from Many jobs on OSD and o3 are incomplete because of ' backend died: missing input at /usr/lib/os-autoinst/bmwqemu.pm line 202.' to Many jobs on OSD and o3 are incomplete because of auto_review:"backend died: missing input at /usr/lib/os-autoinst/bmwqemu.pm line 202"
  • Category set to Concrete Bugs
  • Status changed from New to In Progress
  • Assignee set to okurz
  • Priority changed from High to Immediate
  • Target version set to Ready

Xiaojing_liu is doing a rollback on OSD. I am doing a rollback on o3.

#5 Updated by okurz 4 months ago

  • Priority changed from Immediate to High

Realized the following open points:

#6 Updated by AdamWill 4 months ago

Yeah, I did mention that in the PR. The tests never actually set up or mock a working dbus server and check the 'success' paths. They only check various different failure cases - the calls "really" failing because the service doesn't exist, and a mocked-up case of _dbus_do_call returning an error. It's not that easy with the current test setup to assert that we properly pass through the return values all the way to _dbus_call in 'normal operation'.
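A 'success path' check of the kind described above could look roughly like the following. This is a sketch under stated assumptions only: My::Backend is a hypothetical stand-in class, its _dbus_call wrapper is not the real os-autoinst implementation, and the (2, message) failure code is made up for illustration. It only shows the idea of replacing the low-level _dbus_do_call with a mock that succeeds and asserting that the values are passed through.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in class, not the real backend::qemu.
package My::Backend {
    sub new { return bless {}, shift }

    # Low-level call; in this sketch it always fails, like a missing service.
    sub _dbus_do_call { die "no D-Bus service in this sketch\n" }

    # Wrapper under test: must pass ($rt, $message) through on success.
    sub _dbus_call {
        my ($self, $fn, @args) = @_;
        my ($rt, $message) = eval { $self->_dbus_do_call($fn, @args) };
        return (2, "dbus call failed: $@") if $@;    # hypothetical error code
        return ($rt, $message);
    }
}

{
    # Failure path: the unmocked low-level call dies.
    my ($rt, $message) = My::Backend->new->_dbus_call('show');
    is $rt, 2, 'failure is reported by the wrapper';
}

{
    # Success path: temporarily replace the low-level call with a mock.
    no warnings 'redefine';
    local *My::Backend::_dbus_do_call = sub { return (0, "ok\n") };
    my ($rt, $message) = My::Backend->new->_dbus_call('show');
    is $rt, 0, 'return code is passed through';
    is $message, "ok\n", 'message is passed through';
}

done_testing;
```

The mock is installed with local so that it is undone at the end of the block and does not leak into other tests.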

#7 Updated by okurz 4 months ago

  • Status changed from In Progress to Resolved

Right. But as I stated in the PR, I guess that's OK. This is also why we extracted these _dbus_do_call methods: at least we can mock them in tests of the other code :)

I have invited Xiaojing_liu to have a user account on o3 and added that point to https://progress.opensuse.org/projects/qa/wiki#Onboarding-for-new-joiners
