action #39503
closed
svirt tests fail with unsupported update encoding -1733194013 at /.../consoles/VNC.pm line 988
Description
Since today all my svirt tests fail like this: http://nilgiri.suse.cz/tests/793/file/autoinst-log.txt
DIE unsupported update encoding -1733194013 at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/consoles/VNC.pm line 988.
at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 80.
backend::baseclass::die_handler("unsupported update encoding -1733194013 at /home/newman/openQ"...) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/consoles/VNC.pm line 803
consoles::VNC::catch {...} ("unsupported update encoding -1733194013 at /home/newman/openQ"...) called at /usr/lib/perl5/vendor_perl/5.26.2/Try/Tiny.pm line 123
Try::Tiny::try(CODE(0x5615fbafa768), Try::Tiny::Catch=REF(0x5615fbb23fc0)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/consoles/VNC.pm line 805
consoles::VNC::update_framebuffer(consoles::VNC=HASH(0x5615fbae56a8)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/consoles/vnc_base.pm line 102
consoles::vnc_base::current_screen(consoles::sshVirtsh=HASH(0x5615fca94e40)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 697
backend::baseclass::capture_screenshot(backend::svirt=HASH(0x5615fbb81f58)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 534
backend::baseclass::select_console(backend::svirt=HASH(0x5615fbb81f58), HASH(0x5615fba7dfa8)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 75
backend::baseclass::handle_command(backend::svirt=HASH(0x5615fbb81f58), HASH(0x5615fcc8a7d0)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 487
backend::baseclass::check_socket(backend::svirt=HASH(0x5615fbb81f58), IO::Handle=GLOB(0x5615fc33ac80), 0) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/svirt.pm line 235
backend::svirt::check_socket(backend::svirt=HASH(0x5615fbb81f58), IO::Handle=GLOB(0x5615fc33ac80), 0) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 246
eval {...} called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 156
backend::baseclass::run_capture_loop(backend::svirt=HASH(0x5615fbb81f58)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/baseclass.pm line 129
backend::baseclass::run(backend::svirt=HASH(0x5615fbb81f58), 13, 16) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/driver.pm line 77
backend::driver::__ANON__(Mojo::IOLoop::ReadWriteProcess=HASH(0x5615fccd3bd8)) called at /usr/lib/perl5/vendor_perl/5.26.2/Mojo/IOLoop/ReadWriteProcess.pm line 309
eval {...} called at /usr/lib/perl5/vendor_perl/5.26.2/Mojo/IOLoop/ReadWriteProcess.pm line 309
Mojo::IOLoop::ReadWriteProcess::_fork(Mojo::IOLoop::ReadWriteProcess=HASH(0x5615fccd3bd8), CODE(0x5615fcd40030)) called at /usr/lib/perl5/vendor_perl/5.26.2/Mojo/IOLoop/ReadWriteProcess.pm line 445
Mojo::IOLoop::ReadWriteProcess::start(Mojo::IOLoop::ReadWriteProcess=HASH(0x5615fccd3bd8)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/driver.pm line 79
backend::driver::start(backend::driver=HASH(0x5615fb9e8e90)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/backend/driver.pm line 50
backend::driver::new("backend::driver", "svirt") called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/isotovideo line 199
main::init_backend() called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/isotovideo line 267
And then:
[2018-08-09T12:40:17.0614 CEST] [debug] syswrite failed Broken pipe at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/myjsonrpc.pm line 38.
myjsonrpc::send_json(GLOB(0x5615f9e9b908), HASH(0x5615fba3a360)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/autotest.pm line 282
autotest::query_isotovideo("backend_last_screenshot_data") called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/basetest.pm line 498
basetest::_result_add_screenshot(bootloader_svirt=HASH(0x5615fb6d5420), HASH(0x5615f9eb0ed0)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/basetest.pm line 350
basetest::runtest(bootloader_svirt=HASH(0x5615fb6d5420)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/autotest.pm line 328
eval {...} called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/autotest.pm line 327
autotest::runalltests() called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/autotest.pm line 183
eval {...} called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/autotest.pm line 183
autotest::run_all() called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/autotest.pm line 236
autotest::__ANON__(Mojo::IOLoop::ReadWriteProcess=HASH(0x5615fb9e8ec0)) called at /usr/lib/perl5/vendor_perl/5.26.2/Mojo/IOLoop/ReadWriteProcess.pm line 309
eval {...} called at /usr/lib/perl5/vendor_perl/5.26.2/Mojo/IOLoop/ReadWriteProcess.pm line 309
Mojo::IOLoop::ReadWriteProcess::_fork(Mojo::IOLoop::ReadWriteProcess=HASH(0x5615fb9e8ec0), CODE(0x5615fb95b608)) called at /usr/lib/perl5/vendor_perl/5.26.2/Mojo/IOLoop/ReadWriteProcess.pm line 445
Mojo::IOLoop::ReadWriteProcess::start(Mojo::IOLoop::ReadWriteProcess=HASH(0x5615fb9e8ec0)) called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/autotest.pm line 237
autotest::start_process() called at /home/newman/openQA/os-autoinst-distri-opensuse/os-autoinst/isotovideo line 265
Running on latest Tumbleweed with:
openQA-4.6.1533738973.5185383c-741.1.noarch
os-autoinst-4.5.1533739786.546c7c63-97.1.x86_64
perl-Mojo-Pg-4.08-1.2.noarch
perl-Mojolicious-Plugin-AssetPack-2.04-1.1.noarch
perl-Mojo-RabbitMQ-Client-0.2.0-1.1.noarch
perl-Mojolicious-7.88-1.1.noarch
perl-Mojolicious-Plugin-RenderFile-0.12-2.1.noarch
perl-Mojo-IOLoop-ReadWriteProcess-0.20-1.2.noarch
Updated by michalnowak over 6 years ago
svirt jobs on OSD run OK, so shouldn't be related to the infra.
Updated by EDiGiacinto over 6 years ago
The second message should not be part of the real problem (the failure is in the first one; the second is just a consequence, afaict).
Did you upgrade your machine lately? Can you link jobs on OSD that are passing? Is it always reproducible for you?
Also, could you try downgrading Mojolicious so we can rule it out as the cause? (I'm pretty sure it's not the cause, but I'd prefer to clear the air and be sure.)
Thanks! (OSD still runs on different versions, and we already have a wild mix.)
Updated by michalnowak over 6 years ago
Yes, I upgraded Tumbleweed today and yesterday, though I restarted only after I found jobs on my local openQA failing.
This job worked two days ago for me locally (http://nilgiri.suse.cz/tests/753); now it ends like this: http://nilgiri.suse.cz/tests/794. The same test suite on OSD passed today: https://openqa.suse.de/tests/1914213. I can reproduce it 100% on svirt; the qemu backend, on the other hand, works for me.
Originally, I discussed the issue with Martin L., as I'd heard he had an issue with Mojo, so I downgraded perl-Mojolicious and the other Mojo Perl packages (except perl-Mojolicious-Plugin-AssetPack, as openQA requires v2.04) to their Leap 15.0 versions, but the issue persisted. Now I am back on the Tumbleweed Mojo packages. I'll do a couple of rollbacks to see if there's a combination that works for me.
Updated by EDiGiacinto over 6 years ago
Just double-checked: ow2 (the one that executed your test) is running os-autoinst from the same version of master (currently 546c7c633f891cfac598082258b2c39c2cfebcaf). I'm inclined to think that this comes from a bunch of upgrades.
Updated by michalnowak over 6 years ago
Rolled back to Aug 1 snapshot and this is what works for me:
openQA-4.6.1532958106.40f0f07f-726.1.noarch
os-autoinst-4.5.1533044092.75f30a63-83.1.x86_64
perl-Mojo-IOLoop-ReadWriteProcess-0.20-1.2.noarch
perl-Mojo-Pg-4.08-1.2.noarch
perl-Mojo-RabbitMQ-Client-0.1.0-1.2.noarch
perl-Mojolicious-7.81-1.1.noarch
perl-Mojolicious-Plugin-AssetPack-2.04-1.1.noarch
perl-Mojolicious-Plugin-RenderFile-0.12-2.1.noarch
Updated by EDiGiacinto over 6 years ago
Ok - from that point, can you upgrade just os-autoinst (not the dependencies) and see if the problem comes back?
Updated by michalnowak over 6 years ago
Once I install the VNC stuff from this update it breaks:
The following NEW package is going to be installed:
xorg-x11-Xvnc-module 1.9.0-1.1
The following 3 packages are going to be upgraded:
libXvnc1 1.8.0-13.1 -> 1.9.0-1.1
tigervnc 1.8.0-13.1 -> 1.9.0-1.1
xorg-x11-Xvnc 1.8.0-13.1 -> 1.9.0-1.1
Hence the "unsupported update encoding -1733194013 at ...VNC.pm" error.
Updated by EDiGiacinto over 6 years ago
Thanks for investigating this.
@foursixnine @coolo - i guess at some point we will have to handle new VNC versions, so maybe this is a good candidate for our Ready backlog.
Updated by coolo over 6 years ago
there is no such thing as 'new VNC versions' - this is all RFB 3.8. And update encoding -1733194013 is just a little over the top of what is defined:
https://github.com/rfbproto/rfbproto/blob/master/rfbproto.rst#encodings
So either Xvnc is just buggy - or we misread some other frame very badly
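To illustrate the "misread some other frame" possibility, here is a minimal Python sketch (not the os-autoinst code). It assumes the RFB rectangle header layout of four unsigned 16-bit fields followed by a signed 32-bit encoding type, all big-endian. The `KNOWN_ENCODINGS` table and the helper names are hypothetical, and the four "stray" bytes are simply the byte pattern that decodes to the number in the error message.

```python
import struct

# RFB rectangle header: x, y, width, height (U16 each) + encoding-type (S32), big-endian.
RECT_HEADER = struct.Struct(">HHHHi")

# Illustrative subset of encoding numbers from the rfbproto list; NOT os-autoinst's table.
KNOWN_ENCODINGS = {0: "Raw", 1: "CopyRect", 2: "RRE", 5: "Hextile", 16: "ZRLE",
                   -223: "DesktopSize", -239: "Cursor"}


def parse_rect_header(data: bytes):
    """Return (x, y, width, height, encoding) from the first 12 bytes of data."""
    return RECT_HEADER.unpack_from(data)


def describe_encoding(encoding: int) -> str:
    """Name a known encoding, or produce an error string for an unknown one."""
    return KNOWN_ENCODINGS.get(encoding, f"unsupported update encoding {encoding}")


# A well-formed header for a 1024x768 Raw rectangle.
good = RECT_HEADER.pack(0, 0, 1024, 768, 0)

# The same header with four arbitrary bytes where the encoding should be, as
# would happen if the reader were a few bytes out of step with the stream.
bad = good[:8] + bytes.fromhex("98b18ee3")

print(describe_encoding(parse_rect_header(good)[4]))
print(describe_encoding(parse_rect_header(bad)[4]))
```

Any four bytes read at the wrong offset become an "encoding" this way, which is why a value this far outside the defined range points to either a server bug or a client that lost sync earlier in the stream.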
Updated by coolo over 6 years ago
Oh - and just so you understand: in RFB the client tells the server its supported encodings. The server sending unsupported encodings is considered a waste of bandwidth.
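For reference, the client-to-server announcement is the RFB SetEncodings message. A minimal Python sketch of how such a message is laid out, assuming the layout from the RFB specification (message type 2, one padding byte, a 16-bit count, then one signed 32-bit value per encoding); `build_set_encodings` is a hypothetical helper, not an os-autoinst function:

```python
import struct


def build_set_encodings(encodings):
    """Pack an RFB SetEncodings message.

    Layout: U8 message-type (2), U8 padding, U16 number-of-encodings,
    then one S32 per encoding, all big-endian.
    """
    header = struct.pack(">BxH", 2, len(encodings))
    body = struct.pack(f">{len(encodings)}i", *encodings)
    return header + body


# Advertise ZRLE, Hextile, Raw and the DesktopSize pseudo-encoding, in order
# of preference. The selection is illustrative only.
message = build_set_encodings([16, 5, 0, -223])
print(message.hex())
```

A server that honours this list should only ever send rectangles whose encoding type appears in it.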
Updated by EDiGiacinto over 6 years ago
Well, if you prefer it put that way: if VNC is upgraded it might implement a different RFB protocol version, or some change could affect how we read the data, or we simply misread the data.
But if that comes up right after an upgrade, it raises red flags for me.
Updated by EDiGiacinto over 6 years ago
And as you might have heard already, a 4.x will happen sooner or later :) This upgrade will hit everybody anyway, so you can call it a buggy Xvnc version, but we still have to figure out whether this is something we will also have to deal with in future versions.
Updated by michalnowak over 6 years ago
Now I am on Tumbleweed 20180812 with libXvnc1, tigervnc, xorg-x11-Xvnc, and xorg-x11-Xvnc-module locked to version 1.8.0-13.1, and everything works; updating to 1.9.0 breaks things.
Updated by oorlov over 6 years ago
Faced the same issue with svirt on kvm: http://oorlov-vm.qa.suse.de/tests/121/file/autoinst-log.txt
Updated by oorlov over 6 years ago
- Blocks action #36279: [sle][functional][u][s390x][medium] test fails in reboot_gnome on bsc#1085181 - help with investigation using new shutdown debug method added
Updated by riafarov over 6 years ago
- File tigervnc-1.8.0-lp150.9.1.x86_64.rpm added
- File libXvnc1-1.8.0-lp150.9.1.x86_64.rpm added
I've attached the RPMs I used on my TW, as they are not in the repo anymore. I had to uninstall xorg-x11-Xvnc-module, as I could not find a 1.8 version of it, but it works without it.
Updated by coolo over 6 years ago
- Blocks deleted (action #36279: [sle][functional][u][s390x][medium] test fails in reboot_gnome on bsc#1085181 - help with investigation using new shutdown debug method)
Updated by coolo over 6 years ago
- Status changed from New to Resolved
https://github.com/os-autoinst/os-autoinst/pull/1028 - if you need a quick fix, just set -261 to supported => 0 in your local installation's VNC.pm
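A rough Python sketch of the idea behind that quick fix, assuming the client keeps a table of encodings with a per-entry `supported` flag and only advertises the flagged ones to the server. The table contents, the entry names, and both helper functions are hypothetical and are not copied from VNC.pm or from the pull request:

```python
# Hypothetical encoding table; the real one lives in os-autoinst's VNC.pm.
ENCODINGS = [
    {"num": 0, "name": "Raw", "supported": True},
    {"num": 16, "name": "ZRLE", "supported": True},
    {"num": -223, "name": "DesktopSize", "supported": True},
    {"num": -261, "name": "LED state", "supported": True},
]


def disable(table, num):
    """Return a copy of the table with one encoding marked unsupported."""
    return [dict(e, supported=False) if e["num"] == num else dict(e) for e in table]


def advertised(table):
    """List the encoding numbers the client would announce to the server."""
    return [e["num"] for e in table if e["supported"]]


patched = disable(ENCODINGS, -261)
print(advertised(ENCODINGS))
print(advertised(patched))
```

Presumably, once -261 is no longer announced, the server stops sending that pseudo-encoding and the client never has to parse it.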
Updated by okurz about 6 years ago
coolo wrote:
if you need a quick fix […]
Current packages in devel:openQA with version 4.5.1537682748.0d10ddb9 or newer already have the fix.
As the openQA-in-openQA tests in https://openqa.opensuse.org/tests/overview?distri=openqa&version=Tumbleweed&build=%3ATW.1761&groupid=24 fail, the job http://lord.arch.suse.de:8080/job/monitor-openQA_in_openQA-TW/lastFailedBuild/ fails as well, so no submission to openSUSE:Factory has been created yet. See https://progress.opensuse.org/issues/41465 for that.
Updated by oorlov about 6 years ago
okurz wrote:
Current packages in devel:openQA with version 4.5.1537682748.0d10ddb9 or newer already have the fix.
The version you mentioned was released on 2018-09-23T06:05:48+00:00.
I have version 4.6.1537792906.6c8c7f4a (which is newer, 2018-09-24T12:41:46+00:00) and I still face the issue: http://oorlov-vm.qa.suse.de/tests/149/file/autoinst-log.txt
Updated by coolo about 6 years ago
4.6.X is an openQA version and irrelevant. Your log shows you're running 4.5.1536750184.92e52b69 of os-autoinst, and that is old: it predates the fixed 4.5.1537682748.
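The comparison can be done mechanically. This is a small Python sketch that assumes the third dot-separated field of these package versions is a Unix timestamp, which matches the release date quoted above; `build_time` and `has_fix` are hypothetical helpers, and comparing timestamps is only a heuristic for "contains the fix":

```python
from datetime import datetime, timezone


def build_time(version: str) -> datetime:
    """Read the third dot-separated field as a Unix timestamp (assumed scheme)."""
    epoch = int(version.split(".")[2])
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


def has_fix(installed: str, first_fixed: str) -> bool:
    """Heuristic: the installed build is at least as new as the first fixed one."""
    return build_time(installed) >= build_time(first_fixed)


FIRST_FIXED = "4.5.1537682748.0d10ddb9"  # os-autoinst version named as fixed
INSTALLED = "4.5.1536750184.92e52b69"    # os-autoinst version from the log

print(build_time(FIRST_FIXED).isoformat())
print(has_fix(INSTALLED, FIRST_FIXED))
```

Only os-autoinst versions (4.5.x here) should be compared this way; the openQA package (4.6.x) is versioned separately.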
Updated by szarate about 6 years ago
- Target version changed from Current Sprint to Done
Updated by mkittler about 5 years ago
- Related to action #60539: openQA remote-back-ends crashes with Xvnc >= 1.9.80 added