action #39503

svirt tests fail with unsupported update encoding -1733194013 at /.../consoles/VNC.pm line 988

Added by michalnowak over 1 year ago. Updated over 1 year ago.

Status: Resolved
Start date: 09/08/2018
Priority: High
Due date:
Assignee: coolo
% Done: 0%
Category: Concrete Bugs
Target version: Done
Difficulty:
Duration:

Description

Since today all my svirt tests fail like this: http://nilgiri.suse.cz/tests/793/file/autoinst-log.txt

DIE unsupported update encoding -1733194013 at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/consoles/VNC.pm line 988.

 at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 80.
    backend::baseclass::die_handler("unsupported update encoding -1733194013 at /home/newman/openQ"...) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/consoles/VNC.pm line 803
    consoles::VNC::catch {...} ("unsupported update encoding -1733194013 at /home/newman/openQ"...) called at /usr/lib/perl5/vendor_perl/5.26.2/Try/Tiny.pm line 123
    Try::Tiny::try(CODE(0x5615fbafa768), Try::Tiny::Catch=REF(0x5615fbb23fc0)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/consoles/VNC.pm line 805
    consoles::VNC::update_framebuffer(consoles::VNC=HASH(0x5615fbae56a8)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/consoles/vnc_base.pm line 102
    consoles::vnc_base::current_screen(consoles::sshVirtsh=HASH(0x5615fca94e40)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 697
    backend::baseclass::capture_screenshot(backend::svirt=HASH(0x5615fbb81f58)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 534
    backend::baseclass::select_console(backend::svirt=HASH(0x5615fbb81f58), HASH(0x5615fba7dfa8)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 75
    backend::baseclass::handle_command(backend::svirt=HASH(0x5615fbb81f58), HASH(0x5615fcc8a7d0)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 487
    backend::baseclass::check_socket(backend::svirt=HASH(0x5615fbb81f58), IO::Handle=GLOB(0x5615fc33ac80), 0) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/svirt.pm line 235
    backend::svirt::check_socket(backend::svirt=HASH(0x5615fbb81f58), IO::Handle=GLOB(0x5615fc33ac80), 0) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 246
    eval {...} called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 156
    backend::baseclass::run_capture_loop(backend::svirt=HASH(0x5615fbb81f58)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 129
    backend::baseclass::run(backend::svirt=HASH(0x5615fbb81f58), 13, 16) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/driver.pm line 77
    backend::driver::__ANON__(Mojo::IOLoop::ReadWriteProcess=HASH(0x5615fccd3bd8)) called at /usr/lib/perl5/vendor_perl/5.26.2/Mojo/IOLoop/ReadWriteProcess.pm line 309
    eval {...} called at /usr/lib/perl5/vendor_perl/5.26.2/Mojo/IOLoop/ReadWriteProcess.pm line 309
    Mojo::IOLoop::ReadWriteProcess::_fork(Mojo::IOLoop::ReadWriteProcess=HASH(0x5615fccd3bd8), CODE(0x5615fcd40030)) called at /usr/lib/perl5/vendor_perl/5.26.2/Mojo/IOLoop/ReadWriteProcess.pm line 445
    Mojo::IOLoop::ReadWriteProcess::start(Mojo::IOLoop::ReadWriteProcess=HASH(0x5615fccd3bd8)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/driver.pm line 79
    backend::driver::start(backend::driver=HASH(0x5615fb9e8e90)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/driver.pm line 50
    backend::driver::new("backend::driver", "svirt") called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/isotovideo line 199
    main::init_backend() called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/isotovideo line 267

And then:

[2018-08-09T12:40:17.0614 CEST] [debug] syswrite failed Broken pipe at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/myjsonrpc.pm line 38.
    myjsonrpc::send_json(GLOB(0x5615f9e9b908), HASH(0x5615fba3a360)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/autotest.pm line 282
    autotest::query_isotovideo("backend_last_screenshot_data") called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/basetest.pm line 498
    basetest::_result_add_screenshot(bootloader_svirt=HASH(0x5615fb6d5420), HASH(0x5615f9eb0ed0)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/basetest.pm line 350
    basetest::runtest(bootloader_svirt=HASH(0x5615fb6d5420)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/autotest.pm line 328
    eval {...} called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/autotest.pm line 327
    autotest::runalltests() called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/autotest.pm line 183
    eval {...} called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/autotest.pm line 183
    autotest::run_all() called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/autotest.pm line 236
    autotest::__ANON__(Mojo::IOLoop::ReadWriteProcess=HASH(0x5615fb9e8ec0)) called at /usr/lib/perl5/vendor_perl/5.26.2/Mojo/IOLoop/ReadWriteProcess.pm line 309
    eval {...} called at /usr/lib/perl5/vendor_perl/5.26.2/Mojo/IOLoop/ReadWriteProcess.pm line 309
    Mojo::IOLoop::ReadWriteProcess::_fork(Mojo::IOLoop::ReadWriteProcess=HASH(0x5615fb9e8ec0), CODE(0x5615fb95b608)) called at /usr/lib/perl5/vendor_perl/5.26.2/Mojo/IOLoop/ReadWriteProcess.pm line 445
    Mojo::IOLoop::ReadWriteProcess::start(Mojo::IOLoop::ReadWriteProcess=HASH(0x5615fb9e8ec0)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/autotest.pm line 237
    autotest::start_process() called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/isotovideo line 265

Running on latest Tumbleweed with:

openQA-4.6.1533738973.5185383c-741.1.noarch
os-autoinst-4.5.1533739786.546c7c63-97.1.x86_64
perl-Mojo-Pg-4.08-1.2.noarch
perl-Mojolicious-Plugin-AssetPack-2.04-1.1.noarch
perl-Mojo-RabbitMQ-Client-0.2.0-1.1.noarch
perl-Mojolicious-7.88-1.1.noarch
perl-Mojolicious-Plugin-RenderFile-0.12-2.1.noarch
perl-Mojo-IOLoop-ReadWriteProcess-0.20-1.2.noarch

tigervnc-1.8.0-lp150.9.1.x86_64.rpm (247 KB) riafarov, 14/09/2018 09:50 am

libXvnc1-1.8.0-lp150.9.1.x86_64.rpm (23.8 KB) riafarov, 14/09/2018 09:50 am


Related issues

Related to openQA Project - action #60539: openQA remote-back-ends crashes with Xvnc >= 1.9.80 Resolved 03/12/2019

History

#1 Updated by michalnowak over 1 year ago

svirt jobs on OSD run OK, so shouldn't be related to the infra.

#2 Updated by EDiGiacinto over 1 year ago

the second message should not add to the real problem (the failure is in the first one, the second is just a consequence afaict).

Did you upgrade your machine lately? Can you link jobs on OSD that are passing? Is it always reproducible for you?

Also, could you try downgrading Mojolicious so we can rule it out as the cause? (I'm pretty sure it isn't, but I'd prefer to clear the air and be sure.)
Thanks! (OSD still runs on different versions, and we already have a wild mix.)

#3 Updated by michalnowak over 1 year ago

Yes, I upgraded Tumbleweed yesterday and today, though I restarted only after I found jobs on my local openQA failing.

This job worked for me locally two days ago: http://nilgiri.suse.cz/tests/753. Now it ends like this: http://nilgiri.suse.cz/tests/794. The same test suite on OSD passed today: https://openqa.suse.de/tests/1914213. I can reproduce it 100% of the time on svirt; the qemu backend, on the other hand, works for me.

Originally, I discussed the issue with Martin L. as I'd heard he had an issue with Mojo, so I downgraded perl-Mojolicious and the other Mojo perl packages (except perl-Mojolicious-Plugin-AssetPack, as openQA requires v2.04) to the Leap 15.0 versions, but the issue persisted. Now I am back on the Tumbleweed Mojo packages. I'll do a couple of rollbacks to see if there's a combination that works for me.

#4 Updated by EDiGiacinto over 1 year ago

Just double-checked: ow2 (the one that executed your test) is running os-autoinst from the same version of master (currently 546c7c633f891cfac598082258b2c39c2cfebcaf). I'm a bit inclined to think that this comes from a bunch of upgrades.

#5 Updated by michalnowak over 1 year ago

Rolled back to Aug 1 snapshot and this is what works for me:

openQA-4.6.1532958106.40f0f07f-726.1.noarch
os-autoinst-4.5.1533044092.75f30a63-83.1.x86_64
perl-Mojo-IOLoop-ReadWriteProcess-0.20-1.2.noarch
perl-Mojo-Pg-4.08-1.2.noarch
perl-Mojo-RabbitMQ-Client-0.1.0-1.2.noarch
perl-Mojolicious-7.81-1.1.noarch
perl-Mojolicious-Plugin-AssetPack-2.04-1.1.noarch
perl-Mojolicious-Plugin-RenderFile-0.12-2.1.noarch

#6 Updated by EDiGiacinto over 1 year ago

Ok - from that point, can you just upgrade os-autoinst (not the dependencies) and see if the problem is coming back?

#7 Updated by michalnowak over 1 year ago

Once I install the VNC stuff from this update it breaks:

The following NEW package is going to be installed:
  xorg-x11-Xvnc-module  1.9.0-1.1

The following 3 packages are going to be upgraded:
  libXvnc1       1.8.0-13.1 -> 1.9.0-1.1
  tigervnc       1.8.0-13.1 -> 1.9.0-1.1
  xorg-x11-Xvnc  1.8.0-13.1 -> 1.9.0-1.1

Hence the unsupported update encoding -1733194013 at ...VNC.pm error.

#8 Updated by EDiGiacinto over 1 year ago

Thanks for investigating this.

@foursixnine @coolo - i guess at some point we will have to handle new VNC versions, so maybe this is a good candidate for our Ready backlog.

#9 Updated by coolo over 1 year ago

there is no such thing as 'new VNC versions' - this is all RFB 3.8. And update encoding -1733194013 is just a little over the top of what is defined:
https://github.com/rfbproto/rfbproto/blob/master/rfbproto.rst#encodings

So either Xvnc is just buggy - or we misread some other frame very badly
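To make the "misread frame" theory concrete, here is a minimal Python sketch (not the actual Perl code in VNC.pm) of how an RFB rectangle header is laid out: four unsigned 16-bit fields followed by a signed 32-bit encoding type, all big-endian. The `KNOWN_ENCODINGS` table is a deliberately short excerpt, and the `bogus` bytes are invented so that the last four decode to the number from the error message. If the reader is off by a few bytes, arbitrary pixel data ends up in the encoding field and produces a value like this one.

```python
import struct

# A few encoding numbers from the RFB specification (not exhaustive).
KNOWN_ENCODINGS = {
    0: "Raw",
    1: "CopyRect",
    2: "RRE",
    5: "Hextile",
    16: "ZRLE",
    -239: "Cursor pseudo-encoding",
    -223: "DesktopSize pseudo-encoding",
}


def parse_rect_header(data: bytes):
    """Unpack a 12-byte rectangle header: x, y, width, height (U16), encoding (S32)."""
    return struct.unpack(">HHHHi", data[:12])


# Hypothetical input: a plausible 1024x768 rectangle whose encoding field
# holds the four bytes that decode to the value reported in this ticket.
bogus = bytes.fromhex("0000000004000300") + struct.pack(">i", -1733194013)

x, y, w, h, enc = parse_rect_header(bogus)
print(enc, KNOWN_ENCODINGS.get(enc, "unknown"))
# The same four bytes viewed as an unsigned value, for comparing with a hex dump.
print(hex(enc & 0xFFFFFFFF))
```

Whether the bytes in the real failure came from pixel data or from a message the client did not expect cannot be told from the number alone.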

#10 Updated by coolo over 1 year ago

Oh - and just so you understand: in RFB the client tells the server its supported encodings. The server sending unsupported encodings is considered a waste of bandwidth.
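As a rough illustration of that handshake step, the sketch below builds an RFB SetEncodings message in Python. The wire layout (message type 2, one padding byte, a 16-bit count, then one signed 32-bit value per encoding) follows the RFB specification; the function name and the particular list of encodings are only examples and are not taken from os-autoinst.

```python
import struct


def build_set_encodings(encodings):
    """Serialize an RFB SetEncodings client message (message type 2)."""
    # U8 message-type, U8 padding, U16 number-of-encodings
    header = struct.pack(">BxH", 2, len(encodings))
    # One big-endian S32 per encoding, in order of client preference.
    body = b"".join(struct.pack(">i", e) for e in encodings)
    return header + body


# Example preference list: ZRLE, Hextile, Raw, plus the DesktopSize pseudo-encoding.
msg = build_set_encodings([16, 5, 0, -223])
print(msg.hex())
```

A client that wants to stop receiving a given encoding or pseudo-encoding simply leaves its number out of this list.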

#11 Updated by EDiGiacinto over 1 year ago

Well, if you prefer it put that way: if VNC is upgraded, it might implement a different RFB protocol version, or some change could affect how we read the data, or we simply misread it.

But if that comes up right after an upgrade, it raises red flags to me.

#12 Updated by EDiGiacinto over 1 year ago

And as you might have heard already, a 4.x will happen sooner or later :) This upgrade will hit everybody anyway, so you can call it a buggy Xvnc version, but we still have to figure out whether this is something we will also have to deal with in future versions.

#13 Updated by michalnowak over 1 year ago

Now I am on Tumbleweed 20180812 with libXvnc1, tigervnc, xorg-x11-Xvnc, and xorg-x11-Xvnc-module locked to version 1.8.0-13.1, and everything works; updating to 1.9.0 breaks things.

#14 Updated by oorlov over 1 year ago

Faced the same issue with svirt on kvm: http://oorlov-vm.qa.suse.de/tests/121/file/autoinst-log.txt

#15 Updated by oorlov over 1 year ago

  • Blocks action #36279: [sle][functional][u][s390x][medium] test fails in reboot_gnome on bsc#1085181 - help with investigation using new shutdown debug method added

#16 Updated by riafarov over 1 year ago

I've attached the rpms I used on my TW, as they are not in the repo anymore. I had to uninstall xorg-x11-Xvnc-module because I could not find a 1.8 version of it, but it works without it.

#17 Updated by coolo over 1 year ago

  • Target version set to Current Sprint

#18 Updated by coolo over 1 year ago

  • Assignee set to coolo

#19 Updated by coolo over 1 year ago

  • Blocks deleted (action #36279: [sle][functional][u][s390x][medium] test fails in reboot_gnome on bsc#1085181 - help with investigation using new shutdown debug method)

#20 Updated by coolo over 1 year ago

  • Status changed from New to Resolved

https://github.com/os-autoinst/os-autoinst/pull/1028 - if you need a quick fix, just set -261 to supported => 0 in your local installation's VNC.pm

#21 Updated by okurz over 1 year ago

coolo wrote:

if you need a quick fix […]

Current packages in devel:openQA with version 4.5.1537682748.0d10ddb9 or newer already have the fix.

As the openQA-in-openQA tests in https://openqa.opensuse.org/tests/overview?distri=openqa&version=Tumbleweed&build=%3ATW.1761&groupid=24 fail, the job http://lord.arch.suse.de:8080/job/monitor-openQA_in_openQA-TW/lastFailedBuild/ fails as well, so no submission to openSUSE:Factory has been created yet. See https://progress.opensuse.org/issues/41465 for that.

#22 Updated by oorlov over 1 year ago

okurz wrote:

Current packages in devel:openQA with version 4.5.1537682748.0d10ddb9 or newer already have the fix.

The version you mentioned was released on 2018-09-23T06:05:48+00:00.

I have version 4.6.1537792906.6c8c7f4a (which is newer, 2018-09-24T12:41:46+00:00) and I still face the issue: http://oorlov-vm.qa.suse.de/tests/149/file/autoinst-log.txt

#23 Updated by coolo over 1 year ago

4.6.X is an openQA version and irrelevant here. Your log shows you're running os-autoinst 4.5.1536750184.92e52b69, and that is too old.
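The comparison can be checked mechanically. Judging by the release times quoted in this thread, the third component of these package versions looks like a Unix timestamp; the Python sketch below relies on that assumption (the `build_time` helper is made up for illustration) to compare the os-autoinst version from the log against the first fixed one.

```python
from datetime import datetime, timezone


def build_time(version: str) -> datetime:
    """Return the build time encoded in a package version string.

    Assumed layout: <major>.<minor>.<unix timestamp>.<git short hash>
    """
    timestamp = int(version.split(".")[2])
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


fixed = build_time("4.5.1537682748.0d10ddb9")    # first os-autoinst package with the fix
running = build_time("4.5.1536750184.92e52b69")  # os-autoinst version seen in the log

print(fixed.isoformat())
print(running.isoformat())
print("has fix:", running >= fixed)
```

Only versions of the same package should be compared this way; the openQA and os-autoinst packages are built separately.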

#24 Updated by szarate over 1 year ago

  • Target version changed from Current Sprint to Done

#25 Updated by mkittler 4 months ago

  • Related to action #60539: openQA remote-back-ends crashes with Xvnc >= 1.9.80 added
