action #76813
opencoordination #125708: [epic] Future ideas for more stable non-qemu backends
[tools] Test using svirt backend fails with auto_review:"Error connecting to VNC server.*: IO::Socket::INET: connect: Connection refused"
0%
Description
Observation¶
- See title and https://openqa.suse.de/tests/4913987
- At the time of creating the ticket the job has been restarted manually and the clone passed.
[2022-01-13T13:57:24.238682+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.92:5901>: IO::Socket::INET: connect: Connection refused
Steps to reproduce¶
Find jobs referencing this ticket with the help of
https://raw.githubusercontent.com/os-autoinst/scripts/master/openqa-query-for-job-label ,
call openqa-query-for-job-label poo#76813
Acceptance criteria¶
- AC1: It is clear to test reviewers what needs to be done in the next step when reading the reason
Suggestions¶
This looks like a temporary problem with the svirt host (here openqaw5-xen.qa.suse.de
). But it is not clear whether the problem is always temporary and an automatic restart would make sense. I mainly created this ticket so jobs like this one won't show up in auto-review anymore.
- What workers are affected? Is it openqaworker2 and grenache only? Yes, obviously because only them use the svirt backend this way.
- Focus on tests trying to connect to a remote machine
- Add better log messages explaining to test reviewers what the message could mean
- Make sure the test ends with a proper reason propagated into the job results
Out of scope¶
- VNC running on localhost, see #105882 about that
Workaround¶
Just retry
Updated by mkittler almost 4 years ago
- Project changed from openQA Infrastructure to openQA Project
Updated by okurz almost 4 years ago
- Tags set to qac, jeos
- Project changed from openQA Project to openQA Tests
- Category set to Infrastructure
- Assignee set to jlausuch
one more candidate for QAC as this is https://progress.opensuse.org/projects/qa/wiki#Out-of-scope for QE Tools :)
Updated by jlausuch almost 4 years ago
Not sure we are gonna fix this backend failure, however I keep the ticket to tag failing tests when it happens again.
Updated by okurz almost 4 years ago
jlausuch wrote:
Not sure we are gonna fix this backend failure, however I keep the ticket to tag failing tests when it happens again.
Maybe I can help to provide some additional information for you to be able to fix it? If a related failure only happens very sporadically probably it is ok to just have a ticket for reference but for the sake of long-term sustainability I suggest you as a team still invest in improving the overall situation. Likely there is not much of an issue to "fix" but to just improve the feedback from the backend if something like this happens. I mean "Connection refused" means that one thing tries to reach another thing which does not work. To help test reviewers to understand what is going on it could help to find out what are these "things", what are likely causes for this to happen, provide some additional information for debugging, e.g. is the target machine at least reachable? Are all prerequisities fulfilled, etc.
Updated by okurz over 3 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: jeos-base+phub@svirt-xen-pv
https://openqa.suse.de/tests/5303636
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by okurz over 3 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: btrfs@64bit-ipmi
https://openqa.suse.de/tests/5433647
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by openqa_review over 3 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: btrfs@64bit-ipmi
https://openqa.suse.de/tests/5433647
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by openqa_review over 3 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: jeos-base+phub@svirt-hyperv-uefi
https://openqa.suse.de/tests/5794800
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by tinita about 3 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: default@svirt-hyperv
https://openqa.suse.de/tests/6492141
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The label in the openQA scenario is removed
Updated by tinita about 3 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: default@svirt-hyperv
https://openqa.suse.de/tests/6492141
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The label in the openQA scenario is removed
Updated by openqa_review about 3 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: default@svirt-hyperv-uefi
https://openqa.suse.de/tests/6888454
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The label in the openQA scenario is removed
Updated by openqa_review about 3 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: btrfs@64bit-ipmi
https://openqa.suse.de/tests/7048944
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The label in the openQA scenario is removed
Updated by openqa_review almost 3 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: select_modules_and_patterns+registration@svirt-hyperv-uefi
https://openqa.suse.de/tests/7217519
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Updated by openqa_review almost 3 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: jeos-base+sdk+desktop@svirt-hyperv
https://openqa.suse.de/tests/7362868
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Updated by jlausuch almost 3 years ago
- Status changed from New to Workable
- Assignee deleted (
jlausuch)
Some PR that might solve the bootloader hyperv problem, but not the root cause of this issue
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/13496
Updated by openqa_review almost 3 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: jeos-base+sdk+desktop@svirt-hyperv
https://openqa.suse.de/tests/7619895
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Updated by openqa_review almost 3 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: btrfs@64bit-ipmi
https://openqa.suse.de/tests/7784343
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Updated by okurz over 2 years ago
- Project changed from openQA Tests to Containers
- Category deleted (
Infrastructure)
@jlausuch https://openqa.suse.de/tests/7619895 is jeos-base+sdk+desktop@svirt-hyperv, so that's in your domain, right?
Updated by jlausuch over 2 years ago
- Project changed from Containers to openQA Tests
okurz wrote:
@jlausuch https://openqa.suse.de/tests/7619895 is jeos-base+sdk+desktop@svirt-hyperv, so that's in your domain, right?
A couple of things:
First, that's a jeos job, so it's not "containers" domain :)
Second, It isn't really related to JeOS, as it's a backend issue.
Third, I haven't seen this issue happening since 1 month (at least in JeOS).
Updated by okurz over 2 years ago
- Subject changed from Test using svirt backend fails with auto_review:"Error connecting to VNC server.*: IO::Socket::INET: connect: Connection refused" to [qe-core] Test using svirt backend fails with auto_review:"Error connecting to VNC server.*: IO::Socket::INET: connect: Connection refused"
- Category set to Bugs in existing tests
uhm, ok. So https://openqa.suse.de/parent_group_overview/34#grouped_by_build unfortunately does not have a description. I assume it's QE-Core then?
Updated by jlausuch over 2 years ago
okurz wrote:
uhm, ok. So https://openqa.suse.de/parent_group_overview/34#grouped_by_build unfortunately does not have a description. I assume it's QE-Core then?
I don't think it "belongs" to any particular squad. It's a backend issue, it belongs (affects) to everyone, so I don't know what to answer :)
Updated by livdywan over 2 years ago
I just ran into it:
https://openqa.suse.de/tests/7968548
The second run https://openqa.suse.de/tests/7968555 failed the same way.
Edit: Third https://openqa.suse.de/tests/7968565
And the force https://openqa.suse.de/tests/7968567
Possibly interesting excerpt from the logs, looks to me like there's an I/O issue stalling or crashing VNC.
Couldn't get a file descriptor referring to the console
%@umount: /var/lib/nfs/rpc_pipefs: target is busy
(In some cases useful info about processes that
use the device is found by lsof(8) or fuser(1).)
Main Menu
0) <-- Back <--
1) Start Installation
2) Settings
3) Expert
4) Exit or Reboot
[2022-01-13T13:57:19.695523+01:00] [debug] no change: 56.9s
[2022-01-13T13:57:22.234352+01:00] [debug] WARNING: check_asserted_screen took 1.54 seconds for 15 candidate needles - make your needles more specific
[2022-01-13T13:57:22.234563+01:00] [debug] no match: 55.9s, best candidate: addon-products-20190923 (0.00)
[2022-01-13T13:57:22.234890+01:00] [debug] considering VNC stalled, no update for 5.07 seconds
[2022-01-13T13:57:24.238682+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.92:5901>: IO::Socket::INET: connect: Connection refused
[2022-01-13T13:57:25.240025+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.92:5901>: IO::Socket::INET: connect: Connection refused
[2022-01-13T13:57:26.241758+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.92:5901>: IO::Socket::INET: connect: Connection refused
[2022-01-13T13:57:27.243136+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.92:5901>: IO::Socket::INET: connect: Connection refused
[2022-01-13T13:57:28.244558+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.92:5901>: IO::Socket::INET: connect: Connection refused
[2022-01-13T13:57:29.245954+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.92:5901>: IO::Socket::INET: connect: Connection refused
[2022-01-13T13:57:30.247334+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.92:5901>: IO::Socket::INET: connect: Connection refused
[2022-01-13T13:57:31.248793+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.92:5901>: IO::Socket::INET: connect: Connection refused
Updated by livdywan over 2 years ago
- Has duplicate action #99345: [tools][qem] Incomplete test runs on s390x with auto_review:"backend died: Error connecting to VNC server.*s390.*Connection timed out":retry size:M added
Updated by livdywan over 2 years ago
- Related to action #104520: Move svirt extract_asset code from os-autoinst-distri-opensuse to os-autoinst/backend/svirt.pm size:M auto_review:"unable to extract assets: Can't call method.+name.+on an undefined value":retry added
Updated by apappas over 2 years ago
- Subject changed from [qe-core] Test using svirt backend fails with auto_review:"Error connecting to VNC server.*: IO::Socket::INET: connect: Connection refused" to [tools] Test using svirt backend fails with auto_review:"Error connecting to VNC server.*: IO::Socket::INET: connect: Connection refused"
- Status changed from Workable to Feedback
This ticket was discussed in the qe-core refinement session.
- Apparently it is a backend issue, and we as a team have neither the mission nor the skills to fix backend issues.
- The jobs linked are JeOS jobs, which are definitely not in our domain.
As a result, we ask for feedback from the qe-tools team. If they don't take up the ticket we will reject it.
Updated by okurz over 2 years ago
- Description updated (diff)
- Status changed from Feedback to New
- Target version set to Ready
ok. I will pick it up for tools and we will once again see what we can do and check on the scope and responsibility.
Updated by okurz over 2 years ago
From today limit=50 openqa-query-for-job-label poo#76813
8089855|2022-02-03 08:56:42|done|incomplete|migration_offline_dvd_sle15sp1_ltss_ha_alpha_node02|backend died: Error connecting to VNC server <localhost:47095>: IO::Socket::INET: connect: Connection refused|grenache-1
8089206|2022-02-03 06:02:49|done|incomplete|textmode_ha_minimal_base|backend died: Error connecting to VNC server <localhost:34795>: IO::Socket::INET: connect: Connection refused|grenache-1
8089197|2022-02-03 05:52:09|done|incomplete|migration_online_sle15sp3_ha_alpha_node01|backend died: Error connecting to VNC server <localhost:45839>: IO::Socket::INET: connect: Connection refused|grenache-1
8089196|2022-02-03 05:51:13|done|incomplete|migration_online_sle15sp2_ha_alpha_node02|backend died: Error connecting to VNC server <localhost:60041>: IO::Socket::INET: connect: Connection refused|grenache-1
8089195|2022-02-03 05:49:51|done|incomplete|migration_online_sle15sp2_ha_alpha_node01|backend died: Error connecting to VNC server <localhost:60589>: IO::Socket::INET: connect: Connection refused|grenache-1
8089178|2022-02-03 05:39:40|done|incomplete|migration_offline_scc_sle15sp2_ha_alpha_node02|backend died: Error connecting to VNC server <localhost:43133>: IO::Socket::INET: connect: Connection refused|grenache-1
8089171|2022-02-03 05:29:21|done|incomplete|migration_offline_scc_sle12sp4_ltss_ha_alpha_node01|backend died: Error connecting to VNC server <localhost:40687>: IO::Socket::INET: connect: Connection refused|grenache-1
8088391|2022-02-03 05:26:32|done|incomplete|gi-guest_developing-on-host_sles15sp3-kvm|backend died: Error connecting to VNC server <localhost:40339>: IO::Socket::INET: connect: Connection refused|grenache-1
8089169|2022-02-03 05:22:35|done|incomplete|migration_offline_scc_sle12sp3_ltss_ha_alpha_node01|backend died: Error connecting to VNC server <localhost:47485>: IO::Socket::INET: connect: Connection refused|grenache-1
8089166|2022-02-03 05:18:20|done|incomplete|ha_textmode_extended|backend died: Error connecting to VNC server <localhost:36727>: IO::Socket::INET: connect: Connection refused|grenache-1
8093592|2022-02-03 05:13:22|done|incomplete|ha_textmode_extended|backend died: Error connecting to VNC server <localhost:51349>: IO::Socket::INET: connect: Connection refused|grenache-1
8093590|2022-02-03 05:08:49|done|incomplete|textmode_ha_minimal_base|backend died: Error connecting to VNC server <localhost:47857>: IO::Socket::INET: connect: Connection refused|grenache-1
8093593|2022-02-03 02:17:10|done|incomplete|create_hdd_ha_textmode|backend died: Error connecting to VNC server <localhost:45515>: IO::Socket::INET: connect: Connection refused|grenache-1
8090209|2022-02-03 02:15:10|done|incomplete|jeos-filesystem|backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused|openqaworker2
8088393|2022-02-03 01:19:09|done|incomplete|gi-guest_sles15sp3-on-host_developing-kvm|backend died: Error connecting to VNC server <localhost:35653>: IO::Socket::INET: connect: Connection refused|grenache-1
8089763|2022-02-03 00:59:16|done|incomplete|virt-guest-migration-sles15sp3-from-sles15sp3-to-developing-xen-dst|backend died: Error connecting to VNC server <localhost:36179>: IO::Socket::INET: connect: Connection refused|openqaworker2
8090393|2022-02-03 00:53:27|done|incomplete|offline_sles15sp1_ltss_pscc_basesys-srv-lgm-pcm_def_full|backend died: Error connecting to VNC server <localhost:48449>: IO::Socket::INET: connect: Connection refused|grenache-1
8089761|2022-02-03 00:50:10|done|incomplete|virt-guest-migration-sles15sp3-from-sles15sp3-to-developing-kvm-dst|backend died: Error connecting to VNC server <localhost:37713>: IO::Socket::INET: connect: Connection refused|openqaworker2
8090379|2022-02-03 00:48:51|done|incomplete|offline_sles15sp3_pscc_base_all_minimal_zVM_preparation|backend died: Error connecting to VNC server <localhost:44575>: IO::Socket::INET: connect: Connection refused|openqaworker2
8089681|2022-02-03 00:40:10|done|incomplete|videomode_text+textmode+role_kvm|backend died: Error connecting to VNC server <localhost:38541>: IO::Socket::INET: connect: Connection refused|grenache-1
8089760|2022-02-03 00:37:49|done|incomplete|virt-guest-migration-sles12sp5-from-sles12sp5-to-developing-kvm-dst|backend died: Error connecting to VNC server <localhost:51769>: IO::Socket::INET: connect: Connection refused|openqaworker2
8090387|2022-02-03 00:34:38|done|incomplete|online_sles15sp1_ltss_pscc_basesys-srv-phub_def_full_y|backend died: Error connecting to VNC server <localhost:34933>: IO::Socket::INET: connect: Connection refused|grenache-1
8090383|2022-02-03 00:28:59|done|incomplete|online_sles15sp3_pscc_basesys-srv-lgm-pcm_def_full_zdup|backend died: Error connecting to VNC server <localhost:43337>: IO::Socket::INET: connect: Connection refused|grenache-1
8089756|2022-02-03 00:28:10|done|incomplete|virt-guest-migration-sles12sp5-from-sles12sp5-to-developing-xen-dst|backend died: Error connecting to VNC server <localhost:47225>: IO::Socket::INET: connect: Connection refused|openqaworker2
8090385|2022-02-03 00:23:51|done|incomplete|online_sles15sp1_ltss_pscc_basesys-srv_def_full_zdup|backend died: Error connecting to VNC server <localhost:52153>: IO::Socket::INET: connect: Connection refused|grenache-1
8090381|2022-02-03 00:14:23|done|incomplete|online_sles15sp3_pscc_basesys-srv_def_full_zdup|backend died: Error connecting to VNC server <localhost:53001>: IO::Socket::INET: connect: Connection refused|grenache-1
8090374|2022-02-03 00:08:38|done|incomplete|offline_sles15sp2_ltss_pscc_base_all_minimal|backend died: Error connecting to VNC server <localhost:55321>: IO::Socket::INET: connect: Connection refused|grenache-1
8090295|2022-02-03 00:08:07|done|incomplete|offline_sles12sp3_ltss_media_tcm-wsm_def_full|backend died: Error connecting to VNC server <localhost:41125>: IO::Socket::INET: connect: Connection refused|grenache-1
8090371|2022-02-02 23:57:48|done|incomplete|online_sles15sp2_ltss_pscc_basesys-srv-desk-dev_def_full_y|backend died: Error connecting to VNC server <localhost:37459>: IO::Socket::INET: connect: Connection refused|grenache-1
8089677|2022-02-02 23:56:52|done|incomplete|textmode|backend died: Error connecting to VNC server <localhost:60019>: IO::Socket::INET: connect: Connection refused|grenache-1
8089675|2022-02-02 23:54:00|done|incomplete|default|backend died: Error connecting to VNC server <localhost:39781>: IO::Socket::INET: connect: Connection refused|openqaworker2
8090368|2022-02-02 23:51:31|done|incomplete|online_sles15sp2_ltss_pscc_basesys-srv-wsm-contm_all_full_y|backend died: Error connecting to VNC server <localhost:44857>: IO::Socket::INET: connect: Connection refused|grenache-1
8089739|2022-02-02 23:34:39|done|incomplete|prj4_guest_upgrade_sles15sp3_on_developing-xen|backend died: Error connecting to VNC server <localhost:33053>: IO::Socket::INET: connect: Connection refused|grenache-1
8090294|2022-02-02 23:30:54|done|incomplete|offline_sles12sp3_ltss_pscc_asmm-lgm_all_full|backend died: Error connecting to VNC server <localhost:49109>: IO::Socket::INET: connect: Connection refused|grenache-1
8089448|2022-02-02 23:19:38|done|incomplete|uefi-gi-guest_developing-on-host_developing-kvm|backend died: Error connecting to VNC server <localhost:32997>: IO::Socket::INET: connect: Connection refused|grenache-1
8089774|2022-02-02 22:48:45|done|incomplete|gi-guest_sles15sp3-on-host_developing-kvm|backend died: Error connecting to VNC server <localhost:60419>: IO::Socket::INET: connect: Connection refused|grenache-1
8089207|2022-02-02 22:37:06|done|incomplete|allmodules+allpatterns-gnome|backend died: Error connecting to VNC server <10.161.145.85:5901>: IO::Socket::INET: connect: Connection refused|grenache-1
8088389|2022-02-02 22:32:45|done|incomplete|gi-guest_developing-on-host_developing-kvm|backend died: Error connecting to VNC server <localhost:46107>: IO::Socket::INET: connect: Connection refused|grenache-1
8089378|2022-02-02 22:22:38|done|incomplete|gi-guest_win2019-on-host_developing-xen|backend died: Error connecting to VNC server <localhost:49681>: IO::Socket::INET: connect: Connection refused|grenache-1
8089780|2022-02-02 22:19:48|done|incomplete|virt-guest-migration-developing-from-developing-to-developing-xen-src|backend died: Error connecting to VNC server <localhost:46003>: IO::Socket::INET: connect: Connection refused|openqaworker2
8089778|2022-02-02 22:18:25|done|incomplete|virt-guest-migration-developing-from-developing-to-developing-kvm-src|backend died: Error connecting to VNC server <localhost:42537>: IO::Socket::INET: connect: Connection refused|openqaworker2
8089769|2022-02-02 22:17:27|done|incomplete|gi-guest_developing-on-host_developing-xen|backend died: Error connecting to VNC server <localhost:47373>: IO::Socket::INET: connect: Connection refused|grenache-1
8088147|2022-02-02 22:12:32|done|incomplete|migration_offline_dvd_sle15sp1_ltss_ha_alpha_node01|backend died: Error connecting to VNC server <localhost:56609>: IO::Socket::INET: connect: Connection refused|grenache-1
8088144|2022-02-02 22:12:10|done|incomplete|migration_offline_dvd_sle15sp2_ha_alpha_node02|backend died: Error connecting to VNC server <localhost:60737>: IO::Socket::INET: connect: Connection refused|grenache-1
8089300|2022-02-02 22:11:55|done|incomplete|ssh-X|backend died: Error connecting to VNC server <localhost:39093>: IO::Socket::INET: connect: Connection refused|openqaworker2
8088150|2022-02-02 22:11:46|done|incomplete|migration_offline_dvd_sle12sp4_ltss_ha_alpha_node02|backend died: Error connecting to VNC server <localhost:57655>: IO::Socket::INET: connect: Connection refused|grenache-1
8088143|2022-02-02 22:11:45|done|incomplete|migration_offline_dvd_sle15sp3_ha_alpha_node01|backend died: Error connecting to VNC server <localhost:50937>: IO::Socket::INET: connect: Connection refused|grenache-1
8088177|2022-02-02 22:09:37|done|incomplete|migration_offline_dvd_sle12sp4_ltss_ha_alpha_node01|backend died: Error connecting to VNC server <localhost:35093>: IO::Socket::INET: connect: Connection refused|grenache-1
8088176|2022-02-02 22:09:18|done|incomplete|migration_offline_dvd_sle15sp3_ha_alpha_node02|backend died: Error connecting to VNC server <localhost:56829>: IO::Socket::INET: connect: Connection refused|grenache-1
8088175|2022-02-02 22:08:17|done|incomplete|migration_offline_dvd_sle15sp2_ha_alpha_node01|backend died: Error connecting to VNC server <localhost:41807>: IO::Socket::INET: connect: Connection refused|grenache-openqa-query-for-job-label poo#76813
so multiple machines, many jobs reproduce this every day.
Updated by livdywan over 2 years ago
- Subject changed from [tools] Test using svirt backend fails with auto_review:"Error connecting to VNC server.*: IO::Socket::INET: connect: Connection refused" to [tools] Test using svirt backend fails with auto_review:"Error connecting to VNC server.*: IO::Socket::INET: connect: Connection refused" size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by okurz over 2 years ago
- Subject changed from [tools] Test using svirt backend fails with auto_review:"Error connecting to VNC server.*: IO::Socket::INET: connect: Connection refused" size:M to [tools] Test using svirt backend fails with auto_review:"Error connecting to VNC server.*: IO::Socket::INET: connect: Connection refused"
- Category deleted (
Bugs in existing tests) - Status changed from Workable to New
- Target version deleted (
Ready)
Actually we agreed in the estimation meeting that we did not manage to estimate it together so removing the estimate and workable status again.
Updated by okurz over 2 years ago
- Copied to action #105882: Test using svirt backend fails with auto_review:"Error connecting to VNC server.*localhost.*Connection refused" added
Updated by okurz over 2 years ago
- Project changed from openQA Tests to openQA Project
- Description updated (diff)
- Category set to Feature requests
- Target version set to Ready
Updated by okurz over 2 years ago
- Copied to action #106685: Test using svirt backend incomplete with auto_review:"Error connecting to VNC server.*: IO::Socket::INET: connect: Connection timed out":retry added
Updated by openqa_review over 2 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: autoyast_reinstall
https://openqa.suse.de/tests/8180537
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Updated by mkittler over 2 years ago
- Assignee set to mkittler
Looks like the OS is supposed to provide this VNC server as configured via the kernel command line:
<os>
<type>hvm</type>
<initrd>/var/lib/libvirt/images/openQA-SUT-3.initrd</initrd>
<kernel>/var/lib/libvirt/images/openQA-SUT-3.kernel</kernel>
<cmdline>ifcfg=dhcp install=ftp://openqa.suse.de/SLE-15-SP4-Online-s390x-Build98.1-Media1 sshd=1 VNC=1 VNCSize=1024x768 VNCPassword=nots3cr3t sshpassword=nots3cr3t autoyast=http://10.162.6.237:20333/NaB1iG65KSvSwjkj/assets/other/08180440-autoinst.xml SetHostname=0 regurl=http://all-98.1.proxy.scc.suse.de</cmdline>
</os>
[ 0.175007][ T0] Kernel command line: ifcfg=dhcp install=ftp://openqa.suse.de/SLE-15-SP4-Online-s390x-Build98.1-Media1 sshd=1 VNC=1 VNCSize=1024x768 VNCPassword=nots3cr3t sshpassword=nots3cr3t autoyast=http://10.162.6.237:20333/NaB1iG65KSvSwjkj/assets/other/08180440-autoinst.xml SetHostname=0 regurl=http://all-98.1.proxy.scc.suse.de
[ 0.175082][ T0] Unknown kernel command line parameters "ifcfg=dhcp install=ftp://openqa.suse.de/SLE-15-SP4-Online-s390x-Build98.1-Media1 sshd=1 VNC=1 VNCSize=1024x768 VNCPassword=nots3cr3t sshpassword=nots3cr3t autoyast=http://10.162.6.237:20333/NaB1iG65KSvSwjkj/assets/other/08180440-autoinst.xml SetHostname=0 regurl=http://all-98.1.proxy.scc.suse.de", will be passed to user space.
And it looks like it would actually start that VNC server at some point:
Loading Installation System (6/6) - 0% 34% 60%100%
BAD PASSWORD: it is based on a dictionary word
starting rsyslogd (logging to /dev/tty4)... ok
starting klogd... ok
starting nscd... ok
setting temporary root password to 'nots3cr3t'
BAD PASSWORD: it is based on a dictionary word
ssh-keygen: generating new host keys: RSA DSA ECDSA ED25519
Starting SSH daemon... ok
IP addresses:
10.161.145.82
starting slpd to announce VNC...
starting yast...
Starting Installer
%GProbing connected terminal...
78
Initializing virtual console...
Found an unknown terminal on /dev/console (80 columns x 24 lines).
Sampling every 5 s to /var/log/YaST2/memsample.zcat
starting VNC server...
A log file will be written to: /var/log/YaST2/vncserver.log ...
***
*** You can connect to <host>, display :1 now with vncviewer
*** Or use a Java capable browser on http://<host>:5801/
***
(When YaST2 is finished, close your VNC viewer and return to this window.)
Active interfaces:
eth0: 52:54:00:45:1d:43
10.161.145.82/20
fe80::5054:ff:fe45:1d43/64
*** Starting YaST2 ***
Then it looks like it is able to connect (still within the bootloader_start module):
[2022-02-17T11:26:12.837010+01:00] [debug] >>> testapi::wait_serial: Starting YaST2 : ok
[2022-02-17T11:26:12.837241+01:00] [debug] tests/installation/bootloader_start.pm:48 called bootloader_zkvm::run -> tests/installation/bootloader_zkvm.pm:86 called testapi::select_console
[2022-02-17T11:26:12.837362+01:00] [debug] <<< testapi::select_console(testapi_console="installation")
[2022-02-17T11:26:12.838539+01:00] [debug] Establishing VNC connection to 10.161.145.82:5901
[2022-02-17T11:26:12.918091+01:00] [debug] led state 0 0 0 -261
[2022-02-17T11:26:12.926806+01:00] [debug] activate_console, console: installation, type:
[2022-02-17T11:26:12.926912+01:00] [debug] activate_console called with generic type, no action
[2022-02-17T11:26:12.927185+01:00] [debug] tests/installation/bootloader_start.pm:48 called bootloader_zkvm::run -> tests/installation/bootloader_zkvm.pm:86 called testapi::select_console -> lib/susedistribution.pm:904 called testapi::assert_screen
[2022-02-17T11:26:12.927336+01:00] [debug] <<< testapi::assert_screen(mustmatch="installation", no_wait=1, timeout=30)
Then a check screen takes quite long and a VNC stall is detected:
[2022-02-17T11:29:45.752887+01:00] [warn] !!! backend::baseclass::check_asserted_screen: check_asserted_screen took 12.92 seconds for 134 candidate needles - make your needles more specific
[2022-02-17T11:29:45.753026+01:00] [debug] no match: 25.0s, best candidate: manual-intervention-20210719 (0.60)
[2022-02-17T11:29:45.759412+01:00] [warn] !!! backend::baseclass::do_capture: There is some problem with your environment, we detected a stall for 12.9358730316162 seconds
[2022-02-17T11:29:46.955969+01:00] [warn] !!! backend::baseclass::check_asserted_screen: check_asserted_screen took 1.19 seconds for 134 candidate needles - make your needles more specific
[2022-02-17T11:29:46.956108+01:00] [debug] no match: 12.0s, best candidate: manual-intervention-20210719 (0.56)
[2022-02-17T11:30:00.022348+01:00] [warn] !!! backend::baseclass::check_asserted_screen: check_asserted_screen took 13.06 seconds for 134 candidate needles - make your needles more specific
[2022-02-17T11:30:00.022498+01:00] [debug] no match: 10.8s, best candidate: manual-intervention-20210719 (0.60)
[2022-02-17T11:30:00.026329+01:00] [warn] !!! backend::baseclass::do_capture: There is some problem with your environment, we detected a stall for 13.0687019824982 seconds
[2022-02-17T11:30:13.001954+01:00] [warn] !!! backend::baseclass::check_asserted_screen: check_asserted_screen took 12.97 seconds for 134 candidate needles - make your needles more specific
[2022-02-17T11:30:13.002110+01:00] [debug] no match: -2.3s, best candidate: manual-intervention-20210719 (0.60)
[2022-02-17T11:30:13.292796+01:00] [debug] >>> testapi::_check_backend_response: match=autoyast-boot,autoyast-error,autoyast-stage1-reboot-upcoming,bios-boot,import-untrusted-gpg-key,linuxrc-install-fail,manual-intervention,nonexisting-package,nvidia-validation-failed,package-notification,reboot-after-installation,scc-invalid-url,warning-pop-up timed out after 50 (check_screen)
[2022-02-17T11:30:13.327518+01:00] [warn] !!! testapi::_check_backend_response: stall detected during check_screen failure!
[2022-02-17T11:30:13.338752+01:00] [debug] tests/autoyast/installation.pm:178 called installation::verify_timeout_and_check_screen -> tests/autoyast/installation.pm:103 called testapi::check_screen
[2022-02-17T11:30:13.339084+01:00] [debug] <<< testapi::check_screen(mustmatch=[
"bios-boot",
"nonexisting-package",
"reboot-after-installation",
"linuxrc-install-fail",
"scc-invalid-url",
"warning-pop-up",
"autoyast-boot",
"package-notification",
"nvidia-validation-failed",
"import-untrusted-gpg-key",
"autoyast-error",
"autoyast-stage1-reboot-upcoming",
"manual-intervention"
], timeout=50)
[2022-02-17T11:30:15.389680+01:00] [warn] !!! backend::baseclass::check_asserted_screen: check_asserted_screen took 2.05 seconds for 134 candidate needles - make your needles more specific
[2022-02-17T11:30:15.389822+01:00] [debug] no match: 49.9s, best candidate: manual-intervention-20210719 (0.56)
This happens multiple times:
[2022-02-17T11:36:41.374822+01:00] [warn] !!! backend::baseclass::do_capture: There is some problem with your environment, we detected a stall for 12.8661558628082 seconds
Then we see log messages indicating a reboot - so the system was actually just rebooting (which would explain the VNC stall):
[ 3.058782][ T1] systemd[1]: Detected virtualization kvm.
[ 3.058787][ T1] systemd[1]: Detected architecture s390x.
Welcome to SUSE Linux Enterprise Server 15 SP4!
It also tries to start the VNC server again. However, that doesn't work because the address is already in use:
***
*** Please return to your X-Server screen to finish installation
***
starting VNC server...
A log file will be written to: /var/log/YaST2/vncserver.log ...
***
*** You can connect to <host>, display :1 now with vncviewer
*** Or use a Java capable browser on http://<host>:5801/
***
(When YaST2 is finished, close your VNC viewer and return to this window.)
Active interfaces:
eth0: 52:54:00:45:1d:43
10.161.145.82/20
fe80::5054:ff:fe45:1d43/64
*** Starting YaST2 ***
2022-02-17T05:41:40.397594-05:00 susetest systemd[6068]: xvnc.socket: Failed to create listening socket ([::]:5901): Address already in use
[FAILED] Failed to listen on Xvnc Server.
2022-02-17T05:41:40.407537-05:00 susetest systemd[1]: xvnc.socket: Failed to receive listening socket ([::]:5901): Input/output error
2022-02-17T05:41:40.407645-05:00 susetest systemd[1]: Failed to listen on Xvnc Server.
removed '/root/.vnc/passwd.yast'
%@
Welcome to SUSE Linux Enterprise Server 15 SP4 Snapshot-202202-3 (s390x) - Kernel 5.14.21-150400.7-default (ttysclp0).
That os-autoinst fails to re-connect is therefore no surprise:
[2022-02-17T11:38:42.215876+01:00] [debug] <<< testapi::select_console(testapi_console="x11", await_console=0)
[2022-02-17T11:38:42.216437+01:00] [debug] Establishing VNC connection to 10.161.145.82:5901
[2022-02-17T11:38:44.221660+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.82:5901>: IO::Socket::INET: connect: No route to host
[2022-02-17T11:38:45.224877+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.82:5901>: IO::Socket::INET: connect: No route to host
[2022-02-17T11:38:46.226871+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.82:5901>: IO::Socket::INET: connect: No route to host
[2022-02-17T11:38:47.228471+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.82:5901>: IO::Socket::INET: connect: No route to host
[2022-02-17T11:38:48.231267+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.82:5901>: IO::Socket::INET: connect: No route to host
[2022-02-17T11:38:49.232907+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.82:5901>: IO::Socket::INET: connect: No route to host
[2022-02-17T11:38:50.234302+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.82:5901>: IO::Socket::INET: connect: No route to host
[2022-02-17T11:38:51.236243+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.82:5901>: IO::Socket::INET: connect: No route to host
[2022-02-17T11:38:52.365064+01:00] [info] ::: basetest::runtest: # Test died: Error connecting to VNC server <10.161.145.82:5901>: IO::Socket::INET: connect: No route to host at /usr/lib/os-autoinst/testapi.pm line 1763.
testapi::select_console("x11", "await_console", 0) called at sle/lib/utils.pm line 1482
utils::reconnect_mgmt_console("timeout", 700, "grub_timeout", 180) called at sle/tests/autoyast/installation.pm line 323
installation::run(installation=HASH(0x1000d4c3c18)) called at /usr/lib/os-autoinst/basetest.pm line 360
eval {...} called at /usr/lib/os-autoinst/basetest.pm line 354
basetest::runtest(installation=HASH(0x1000d4c3c18)) called at /usr/lib/os-autoinst/autotest.pm line 372
eval {...} called at /usr/lib/os-autoinst/autotest.pm line 372
autotest::runalltests() called at /usr/lib/os-autoinst/autotest.pm line 242
eval {...} called at /usr/lib/os-autoinst/autotest.pm line 242
autotest::run_all() called at /usr/lib/os-autoinst/autotest.pm line 296
autotest::__ANON__(Mojo::IOLoop::ReadWriteProcess=HASH(0x1000e24ea88)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
eval {...} called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
Mojo::IOLoop::ReadWriteProcess::_fork(Mojo::IOLoop::ReadWriteProcess=HASH(0x1000e24ea88), CODE(0x1000f5b1de0)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 488
Mojo::IOLoop::ReadWriteProcess::start(Mojo::IOLoop::ReadWriteProcess=HASH(0x1000e24ea88)) called at /usr/lib/os-autoinst/autotest.pm line 298
I've also checked some other jobs and they look similar.
Updated by mkittler over 2 years ago
Strangely, the last good job (https://openqa.suse.de/tests/8014876) has the same error:
starting VNC server...
A log file will be written to: /var/log/YaST2/vncserver.log ...
***
*** You can connect to <host>, display :1 now with vncviewer
*** Or use a Java capable browser on http://<host>:5801/
***
(When YaST2 is finished, close your VNC viewer and return to this window.)
Active interfaces:
eth0: 52:54:00:84:f8:77
10.161.145.86/20
fe80::5054:ff:fe84:f877/64
*** Starting YaST2 ***
2022-01-21T15:48:14.257648-05:00 susetest systemd[6058]: xvnc.socket: Failed to create listening socket ([::]:5901): Address already in use
[FAILED] Failed to listen on Xvnc Server.
2022-01-21T15:48:14.268804-05:00 susetest systemd[1]: xvnc.socket: Failed to receive listening socket ([::]:5901): Input/output error
2022-01-21T15:48:14.268915-05:00 susetest systemd[1]: Failed to listen on Xvnc Server.
removed '/root/.vnc/passwd.yast'
%@
Welcome to SUSE Linux Enterprise Server 15 SP4 Beta3-202201 (s390x) - Kernel 5.14.21-150400.4-default (ttysclp0).
So this is likely a wrong lead. In the good run, the VNC connection can be established despite this error:
eth0: 10.161.145.86 fe80::5054:ff:fe84:f877
susetest login: [2022-01-21T21:45:22.857260+01:00] [debug] >>> testapi::wait_serial: (?^u:Welcome to SUSE Linux Enterprise .*\(s390x\)): ok
[2022-01-21T21:45:22.858722+01:00] [debug] tests/autoyast/installation.pm:323 called utils::reconnect_mgmt_console -> lib/utils.pm:1453 called testapi::select_console
[2022-01-21T21:45:22.858907+01:00] [debug] <<< testapi::select_console(testapi_console="x11", await_console=0)
[2022-01-21T21:45:22.859517+01:00] [debug] Establishing VNC connection to 10.161.145.86:5901
[2022-01-21T21:45:24.077348+01:00] [debug] led state 0 0 0 -261
[2022-01-21T21:45:24.086722+01:00] [debug] activate_console, console: x11, type:
[2022-01-21T21:45:24.086846+01:00] [debug] activate_console called with generic type, no action
[2022-01-21T21:45:24.211842+01:00] [debug] ||| finished installation autoyast (runtime: 821 s)
[2022-01-21T21:45:24.229569+01:00] [debug] ||| starting console tests/autoyast/console.pm
[2022-01-21T21:45:24.230706+01:00] [debug] tests/autoyast/console.pm:18 called testapi::select_console
[2022-01-21T21:45:24.230833+01:00] [debug] <<< testapi::select_console(testapi_console="root-console")
[2022-01-21T21:45:24.240771+01:00] [debug] Establishing VNC connection to localhost:56165
[2022-01-21T21:45:25.393453+01:00] [debug] Connected to Xvnc - PID 675439
icewm PID is 675520
[2022-01-21T21:45:26.402275+01:00] [debug] Wait for SSH on host 10.161.145.86 (timeout: 120)
xterm PID is 675522
[2022-01-21T21:45:26.466387+01:00] [debug] led state 0 0 0 -261
[2022-01-21T21:45:26.474955+01:00] [debug] activate_console, console: root-console, type: console
[2022-01-21T21:45:26.475086+01:00] [debug] backend ipmi || spvm || pvm_hmc || s390x || zkvm
[2022-01-21T21:45:26.475306+01:00] [debug] tests/autoyast/console.pm:18 called testapi::select_console -> lib/susedistribution.pm:748 called susedistribution::handle_password_prompt -> lib/susedistribution.pm:48 called testapi::assert_screen
[2022-01-21T21:45:26.475465+01:00] [debug] <<< testapi::assert_screen(mustmatch="password-prompt", timeout=60)
Updated by mkittler over 2 years ago
Maybe the failing VNC connection isn't used anyways? In other words: Maybe in the good case the lingering VNC server is lingering long enough to allow the re-connect but actually the test does not need this re-connect attempt anyways? This would make sense considering it is also establishing a connection to the "worker-local" Xvnc server (and supposedly uses that VNC server instead of the SUT-provided one). This would mean that the test should simply remove the VNC console if it isn't used anymore.
Even if that's not the case I'm wondering what we can do from os-autoinst. If the test really needs this SUT-provided VNC server the itself test must somehow ensure that it can be started.
Updated by mkittler over 2 years ago
I overlooked one thing: The VNC port differs. The VNC-server in the SUT is using port 5801 but the connection attempt is to port 5901. The port also differs in the good case. So am I right to assume that this has nothing to do with the SUT-provided VNC after all?
Updated by mkittler over 2 years ago
I suppose the 10.161.145.x
IP is the svrit jumphost, considering that the x86_64 jobs actually use a domain instead (openqaw5-xen.qa.suse.de
) which is definitely the svirt jumphost.
So this is mainly a problem for svirt/x390x and svirt/xen/x86_64:
openqa=> select id, t_finished, test, (select value from job_settings where job_id = jobs.id and key = 'BACKEND') as backend, arch, reason from jobs where reason like '%Error connecting to VNC server%<10%refused%' order by id desc limit 40;
id | t_finished | test | backend | arch | reason
---------+---------------------+--------------------------------------------------------------------------+---------+-------+------------------------------------------------------------------------------------------------------------------
8226548 | 2022-02-25 01:35:39 | qam-gnome | svirt | s390x | backend died: Error connecting to VNC server <10.161.145.95:5901>: IO::Socket::INET: connect: Connection refused
8220467 | 2022-02-24 01:49:45 | qam-gnome | svirt | s390x | backend died: Error connecting to VNC server <10.161.145.99:5901>: IO::Socket::INET: connect: Connection refused
8194332 | 2022-02-19 01:26:11 | qam-gnome | svirt | s390x | backend died: Error connecting to VNC server <10.161.145.93:5901>: IO::Socket::INET: connect: Connection refused
8158377 | 2022-02-14 10:39:51 | minimal_x | svirt | s390x | backend died: Error connecting to VNC server <10.161.145.99:5901>: IO::Socket::INET: connect: Connection refused
8127731 | 2022-02-09 20:09:13 | offline_sles12sp5_pscc_base_all_minimal:investigate:last_good_build:81.1 | svirt | s390x | backend died: Error connecting to VNC server <10.161.145.92:5901>: IO::Socket::INET: connect: Connection refused
8118684 | 2022-02-08 10:48:37 | qam-gnome | svirt | s390x | backend died: Error connecting to VNC server <10.161.145.97:5901>: IO::Socket::INET: connect: Connection refused
8089207 | 2022-02-02 22:37:06 | allmodules+allpatterns-gnome | svirt | s390x | backend died: Error connecting to VNC server <10.161.145.85:5901>: IO::Socket::INET: connect: Connection refused
7983693 | 2022-01-17 10:49:23 | migration_offline_scc_sle12sp4_ltss_ha_alpha_node02 | svirt | s390x | backend died: Error connecting to VNC server <10.161.145.80:5901>: IO::Socket::INET: connect: Connection refused
7911848 | 2021-12-27 10:51:45 | offline_sles15sp2_ltss_media_basesys-srv-desk-dev_def_full | svirt | s390x | backend died: Error connecting to VNC server <10.161.145.90:5901>: IO::Socket::INET: connect: Connection refused
7909289 | 2021-12-27 02:59:43 | offline_sles15sp3_media_basesys-srv-desk-dev-contm-lgm-tsm-wsm_all_full | svirt | s390x | backend died: Error connecting to VNC server <10.161.145.92:5901>: IO::Socket::INET: connect: Connection refused
7898028 | 2021-12-23 08:39:05 | offline_sles12sp5_media_sdk-asmm-contm-lgm-tcm-wsm_all_full | svirt | s390x | backend died: Error connecting to VNC server <10.161.145.99:5901>: IO::Socket::INET: connect: Connection refused
(11 Zeilen)
openqa=> select id, t_finished, test, (select value from job_settings where job_id = jobs.id and key = 'BACKEND') as backend, arch, reason from jobs where reason like '%Error connecting to VNC server%<op%refused%' order by id desc limit 40;
id | t_finished | test | backend | arch | reason
---------+---------------------+-----------------------------------------------------------------------+---------+--------+------------------------------------------------------------------------------------------------------------------------------
8208649 | 2022-02-22 08:13:41 | jeos-containers | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
8208647 | 2022-02-22 08:13:29 | jeos-containers | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5903>: IO::Socket::INET: connect: Connection refused
8204491 | 2022-02-21 15:46:47 | jeos-extratest@jlausuch/os-autoinst-distri-opensuse#suseconnect_retry | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
8202950 | 2022-02-21 15:46:48 | jeos-extratest:investigate:last_good_build:20220220-1 | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5903>: IO::Socket::INET: connect: Connection refused
8167219 | 2022-02-15 01:36:31 | jeos-main | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
8164693 | 2022-02-14 21:48:08 | ext4_yast@jknphy/os-autoinst-distri-opensuse#poc_enh_yaml_sch | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
8164561 | 2022-02-14 21:44:01 | ext4_yast@jknphy/os-autoinst-distri-opensuse#poc_enh_yaml_sch | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
8149970 | 2022-02-11 23:54:22 | jeos-containers | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
8149146 | 2022-02-11 23:55:17 | jeos-containers:investigate:last_good_build:20220210-1 | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5903>: IO::Socket::INET: connect: Connection refused
8124320 | 2022-02-09 09:18:43 | jeos-main | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
8122027 | 2022-02-09 09:18:29 | jeos-extratest | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5903>: IO::Socket::INET: connect: Connection refused
8117896 | 2022-02-08 11:09:01 | jeos-filesystem | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
8111540 | 2022-02-06 10:59:41 | jeos-containers | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5903>: IO::Socket::INET: connect: Connection refused
8110996 | 2022-02-06 10:59:23 | jeos-filesystem | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
8090209 | 2022-02-03 02:15:10 | jeos-filesystem | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
8034373 | 2022-01-26 17:36:50 | jeos-filesystem | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5903>: IO::Socket::INET: connect: Connection refused
8034370 | 2022-01-26 17:36:46 | jeos-containers | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
8033344 | 2022-01-26 09:55:19 | jeos-main | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen-1.qa.suse.de:5919>: IO::Socket::INET: connect: Connection refused
8031298 | 2022-01-26 11:46:24 | jeos-filesystem | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
8031279 | 2022-01-26 11:46:36 | jeos-extratest | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
8030832 | 2022-01-26 00:42:56 | jeos-fips | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
8030822 | 2022-01-26 00:42:56 | jeos-containers | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
8007455 | 2022-01-21 07:59:18 | jeos-extratest | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
8007453 | 2022-01-21 07:59:05 | jeos-extratest | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
7996983 | 2022-01-19 04:28:29 | jeos-filesystem | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5903>: IO::Socket::INET: connect: Connection refused
7995973 | 2022-01-18 21:29:31 | lvm+RAID1 | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5903>: IO::Socket::INET: connect: Connection refused
7995972 | 2022-01-18 21:29:31 | ext4 | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
7983670 | 2022-01-17 09:11:06 | jeos-main | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
7983666 | 2022-01-17 09:06:58 | jeos-fs_stress | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
7983665 | 2022-01-17 09:06:09 | jeos-fips | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5903>: IO::Socket::INET: connect: Connection refused
7983664 | 2022-01-17 09:06:04 | jeos-fips | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
7983663 | 2022-01-17 09:03:47 | jeos-filesystem | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
7983661 | 2022-01-17 09:03:48 | jeos-containers | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5903>: IO::Socket::INET: connect: Connection refused
7983660 | 2022-01-17 09:03:47 | jeos-containers | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
7977963 | 2022-01-15 21:58:10 | jeos-filesystem | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5903>: IO::Socket::INET: connect: Connection refused
7977962 | 2022-01-15 21:58:11 | jeos-fs_stress | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
7977961 | 2022-01-15 21:57:41 | jeos-main | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
7977949 | 2022-01-15 21:55:25 | jeos-fips | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5903>: IO::Socket::INET: connect: Connection refused
7977948 | 2022-01-15 21:42:21 | jeos-fips | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
7977944 | 2022-01-15 21:46:14 | jeos-containers | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5903>: IO::Socket::INET: connect: Connection refused
(40 Zeilen)
That means my previous conclusion that it is about the SUT-provided VNC connection is wrong (also because that VNC server runs on a different port anyways). That also explains why the error about it (that also shows up in passing jobs) is not relevant here.
So either the svirt host isn't reachable anymore in these cases or the virtualization tool (I suppose QEMU in case of s390s and xen in case of x86_64) stopped the VNC server (maybe as part of stopping the VM itself).
Updated by mkittler over 2 years ago
Not all those failures look the same if you have a close look at the logs. E.g. in https://openqa.suse.de/tests/8226548 the connection stops working with a VNC protocol error during the installation:
[2022-02-25T02:21:17.477183+01:00] [warn] !!! backend::baseclass::check_asserted_screen: check_asserted_screen took 0.62 seconds for 124 candidate needles - make your needles more specific
[2022-02-25T02:21:17.477325+01:00] [debug] no match: 19.2s, best candidate: manual-intervention2-20170718 (0.37)
[2022-02-25T02:34:59.441635+01:00] [warn] !!! consoles::VNC::catch {...} : Error in VNC protocol - relogin: short read for zrle data 893 - 1524
[2022-02-25T02:35:33.553842+01:00] [info] ::: backend::baseclass::die_handler: Backend process died, backend errors are reported below in the following lines:
Error connecting to VNC server <10.161.145.95:5901>: IO::Socket::INET: connect: Connection refused
[2022-02-25T02:35:33.554512+01:00] [debug] Closing SSH serial connection with s390zp19.suse.de
In the case from #76813#note-34 it looks more like the VNC connection is re-established after a stall was detected.
Btw, in the logs of svirt/x86_64/xen job https://openqa.suse.de/tests/8208649 we can clearly see that the VNC server is supposed to be provided by libvirt/xen as the corresponding config is logged:
<graphics type="vnc" port="5902" autoport="no" listen="0.0.0.0" sharePolicy="force-shared" passwd="nots3cr3t">
<listen type="address" address="0.0.0.0"/>
</graphics>
In this case it is also definitely the VNC stall detection which triggers the re-connect:
[2022-02-22T09:13:13.523577+01:00] [debug] <<< testapi::type_string(string="(echo 1spU0; bash -eox pipefail /tmp/script1spU0.sh ; echo SCRIPT_FINISHED1spU0-\$?-) | tee /dev/ttyS0\n", max_interval=250, wait_screen_changes=0, wait_still_screen=0, timeout=30, similarity_level=47)
[2022-02-22T09:13:17.386004+01:00] [debug] tests/jeos/firstrun.pm:184 called firstrun::verify_mounts -> tests/jeos/firstrun.pm:54 called testapi::script_output
[2022-02-22T09:13:17.386286+01:00] [debug] <<< testapi::wait_serial(record_output=1, timeout=90, quiet=undef, buffer_size=undef, no_regex=0, regexp="SCRIPT_FINISHED1spU0-\\d+-", expect_not_found=0)
[2022-02-22T09:13:17.880381+01:00] [debug] considering VNC stalled, no update for 4.31 seconds
[2022-02-22T09:13:19.884068+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2022-02-22T09:13:20.885054+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2022-02-22T09:13:21.886534+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2022-02-22T09:13:22.887663+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2022-02-22T09:13:23.889156+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2022-02-22T09:13:24.890710+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2022-02-22T09:13:25.891860+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2022-02-22T09:13:26.893287+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2022-02-22T09:13:27.898708+01:00] [info] ::: backend::baseclass::die_handler: Backend process died, backend errors are reported below in the following lines:
Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2022-02-22T09:13:27.899420+01:00] [debug] Closing SSH serial connection with openqaw5-xen.qa.suse.de
[2022-02-22T09:13:27.899830+01:00] [debug] Destroying openQA-SUT-2 virtual machine
The logs of svirt/s390x don't show such a VNC configuration on VM-level (there is only a cmdline for spawning a VNC server within the SUT but that's not the problematic VNC server here). E.g. that's the entire virsh config for https://openqa.suse.de/tests/8226548:
<domain type='kvm' id='47356'>
<name>openQA-SUT-6</name>
<uuid>d6166688-f8d6-4b34-be45-054500996ff8</uuid>
<description>openQA WebUI: openqa.suse.de (6): 08226548-sle-12-SP5-Server-DVD-Updates-s390x-Build20220225-1-qam-gnome@s390x-kvm-sle12</description>
<memory unit='KiB'>2097152</memory>
<currentMemory unit='KiB'>2097152</currentMemory>
<vcpu placement='static'>1</vcpu>
<resource>
<partition>/machine</partition>
</resource>
<os>
<type arch='s390x' machine='s390-ccw-virtio-3.1'>hvm</type>
<kernel>/var/lib/libvirt/images/openQA-SUT-6.kernel</kernel>
<initrd>/var/lib/libvirt/images/openQA-SUT-6.initrd</initrd>
<cmdline>ifcfg=dhcp install=ftp://openqa.suse.de/SLE-12-SP5-Server-DVD-s390x-GM-DVD1 sshd=1 VNC=1 VNCSize=1024x768 VNCPassword=nots3cr3t sshpassword=nots3cr3t autoyast=http://10.162.6.237:20453/_nIfx2cAt1O6Ae9N/files/autoyast_qam/12-SP5_installation.xml SetHostname=0 regurl=https://scc.suse.com</cmdline>
<boot dev='hd'/>
</os>
<features>
<acpi/>
<apic/>
<pae/>
</features>
<clock offset='utc'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator>/usr/bin/qemu-system-s390x</emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2' cache='unsafe'/>
<source file='/var/lib/libvirt/images/openQA-SUT-6a.img'/>
<backingStore/>
<target dev='vda' bus='virtio'/>
<alias name='virtio-disk0'/>
<address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0000'/>
</disk>
<controller type='pci' index='0' model='pci-root'>
<alias name='pci.0'/>
</controller>
<controller type='virtio-serial' index='0'>
<alias name='virtio-serial0'/>
<address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0002'/>
</controller>
<interface type='direct'>
<mac address='52:54:00:55:e3:d2'/>
<source dev='eth0' mode='bridge'/>
<target dev='macvtap4'/>
<model type='virtio'/>
<alias name='net0'/>
<address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0001'/>
</interface>
<console type='pty' tty='/dev/pts/2'>
<source path='/dev/pts/2'/>
<target type='sclp' port='0'/>
<alias name='console0'/>
</console>
<console type='pty'>
<source path='/dev/pts/22'/>
<target type='virtio' port='1'/>
<alias name='console1'/>
</console>
<memballoon model='virtio'>
<alias name='balloon0'/>
<address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0003'/>
</memballoon>
<panic model='s390'/>
</devices>
</domain>
That makes also sense as I couldn't found an add_vnc
call in s390s test code but leaves me wondering where the VNC server on 10.161.145.95:5901
is configured. It definitely exists as the first connection to it can be established:
[2022-02-25T02:17:10.371517+01:00] [debug] Establishing VNC connection to 10.161.145.95:5901
[2022-02-25T02:17:10.455881+01:00] [debug] activate_console, console: installation, type:
[2022-02-25T02:17:10.455973+01:00] [debug] activate_console called with generic type, no action
[2022-02-25T02:17:10.456175+01:00] [debug] tests/installation/bootloader_start.pm:48 called bootloader_zkvm::run -> tests/installation/bootloader_zkvm.pm:86 called testapi::select_console -> lib/susedistribution.pm:904 called testapi::assert_screen
[2022-02-25T02:17:10.456287+01:00] [debug] <<< testapi::assert_screen(mustmatch="installation", timeout=30, no_wait=1)
[2022-02-25T02:17:11.282043+01:00] [debug] >>> testapi::_handle_found_needle: found inst-console-20191204, similarity 1.00 @ 71/49
It is definitely not the local XVnc because that one is also established (way before):
[2022-02-25T02:15:52.977641+01:00] [debug] Establishing VNC connection to localhost:44977
[2022-02-25T02:15:53.118048+01:00] [debug] Connected to Xvnc - PID 142729
It is also not the SUT provided one because that one uses a different port:
starting VNC server...
A log file will be written to: /var/log/YaST2/vncserver.log ...
***
*** You can connect to <host>, display :1 now with vncviewer
*** Or use a Java capable browser on http://<host>:5801/
I suspect in the s390x case the VNC server on port 5901 is simply present by default and added by the not svirt/s390x specific code in opensuse/lib/susedistribution.pm
(where 5901 is hard-coded).
So likely all tests which use the svirt backend's add_vnc
function to access the VM's VNC are affected (done in bootloader_svirt
module) and svirt/s390x tests where the VNC console is configured in more mysterious ways.
Now that it is somewhat clear which VNC server breaks during the test execution (the one configured on VM-level) the question remains why that happens. The VNC server must be broken/unreachable for more than 10 seconds because our code tries to re-connect for 10 seconds.
It seems to happen quite randomly, except for the s390 cases where it happens often in the installation
module (e.g. https://openqa.suse.de/tests/8226548 and previous jobs).
There's special code within the VNC backend to destroy the VM in case of a reboot. (You can see the related XML in the config above.) However, this code is actually supposed to fix this problem and since the problem doesn't seem to correspond always to restarts it is unlikely to be the cause.
Updated by mkittler over 2 years ago
As next steps we could:
- Determine the fail rate: https://progress.opensuse.org/projects/openqatests/wiki/Wiki#Statistical-investigation
- Adjust the timeout (of 10 seconds) for the VNC connection to see whether increasing the timeout helps and whether the issue can be easier reproduced with a lower timeout.
- The VNC console is selected/added here:
- We could try to invoke carp when the console is instantiated to be sure about the exact place.
Updated by mkittler over 2 years ago
I scheduled a bunch of s390x jobs: for i in {01..50} ; do openqa-clone-job --within-instance https://openqa.suse.de --skip-chained-deps 8226548 TEST=$USER_poo76813_s390x_$i BUILD=poo76813_s390x_investigation _GROUP=0 ; done
-> https://openqa.suse.de/tests/overview?build=poo76813_s390x_investigation&version=12-SP5&distri=sle
And a bunch of xen jobs: for i in {01..50} ; do openqa-clone-job --within-instance https://openqa.suse.de --skip-chained-deps 8208649 TEST=$USER_poo76813_xen_$i BUILD=poo76813_xen_investigation _GROUP=0 ; done
-> https://openqa.suse.de/tests/overview?distri=sle&build=poo76813_xen_investigation&version=15-SP4
Updated by mkittler over 2 years ago
All xen jobs passed. So it doesn't happen very often in that case - at least not during the time frame the test runs have been executed. All x390 tests failed in the same way but due to a different/unrelated issue. So the s390x runs didn't provide any new information.
Maybe it isn't the best approach to catch random network problems (which might be the cause for this). I guess I'll just increase the timeout for now.
Updated by mkittler over 2 years ago
I suppose finding out where in the test distribution the timeout needed to be specified isn't worth it considering that it would be nice if os-autoinst would provide a better default anyways. So I created a PR to increase the VNC timeout for remote VNC connections: https://github.com/os-autoinst/os-autoinst/pull/1992
Updated by okurz over 2 years ago
- Has duplicate deleted (action #99345: [tools][qem] Incomplete test runs on s390x with auto_review:"backend died: Error connecting to VNC server.*s390.*Connection timed out":retry size:M)
Updated by okurz over 2 years ago
- Related to action #99345: [tools][qem] Incomplete test runs on s390x with auto_review:"backend died: Error connecting to VNC server.*s390.*Connection timed out":retry size:M added
Updated by mkittler over 2 years ago
The PR has been merged since 16.03.22 11:12.
It didn't really help that much:
openqa=> select id, t_finished, test, (select value from job_settings where job_id = jobs.id and key = 'BACKEND') as backend, arch, reason from jobs where reason like '%Error connecting to VNC server%<10%refused%' and t_finished >= '2022-03-16 11:12:00' order by id desc limit 40;
id | t_finished | test | backend | arch | reason
---------+---------------------+------------------+---------+-------+------------------------------------------------------------------------------------------------------------------
8360191 | 2022-03-21 01:37:27 | qam-minimal+base | svirt | s390x | backend died: Error connecting to VNC server <10.161.145.94:5901>: IO::Socket::INET: connect: Connection refused
(1 Zeile)
openqa=> select id, t_finished, test, (select value from job_settings where job_id = jobs.id and key = 'BACKEND') as backend, arch, reason from jobs where reason like '%Error connecting to VNC server%<op%refused%' and t_finished >= '2022-03-16 11:12:00' order by id desc limit 40;
id | t_finished | test | backend | arch | reason
---------+---------------------+-----------------------------------------------------------------------+---------+--------+----------------------------------------------------------------------------------------------------------------------------
8345350 | 2022-03-18 06:20:43 | ext4_yast | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
8341382 | 2022-03-17 16:55:24 | extra_tests_gnome_sdk@Avinesh/os-autoinst-distri-opensuse#libqt5_test | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
8339330 | 2022-03-17 16:53:02 | jeos-filesystem | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
8338416 | 2022-03-18 00:53:49 | jeos-extratest:investigate:last_good_build:20220316-1 | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5903>: IO::Socket::INET: connect: Connection refused
8338414 | 2022-03-18 00:54:16 | jeos-extratest:investigate:retry | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
8336822 | 2022-03-17 07:56:06 | jeos-filesystem | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
8336821 | 2022-03-17 07:55:51 | jeos-filesystem | svirt | x86_64 | backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
(7 Zeilen)
Updated by mkittler over 2 years ago
Judging by e.g. https://openqa.suse.de/tests/8336822
[2022-03-17T08:54:46.194965+01:00] [debug] considering VNC stalled, no update for 4.05 seconds
[2022-03-17T08:54:48.198468+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
(omitted 54 similar lines)
[2022-03-17T08:55:45.289048+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
[2022-03-17T08:55:46.294484+01:00] [info] ::: backend::baseclass::die_handler: Backend process died, backend errors are reported below in the following lines:
Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused
[2022-03-17T08:55:46.295165+01:00] [debug] Closing SSH serial connection with openqaw5-xen.qa.suse.de
it really tried to re-connect for a minute. It looks similar for s390x, e.g. https://openqa.suse.de/tests/8360191
[2022-03-21T02:36:15.534600+01:00] [debug] no match: 58.9s, best candidate: addon-products-20190923 (0.00)
%@[2022-03-21T02:36:16.241419+01:00] [debug] no change: 57.9s
[2022-03-21T02:36:16.541229+01:00] [debug] no match: 57.9s, best candidate: addon-products-20190923 (0.00)
Main Menu
0) <-- Back <--
1) Start Installation
2) Settings
3) Expert
4) Exit or Reboot
> [2022-03-21T02:36:17.242377+01:00] [debug] no change: 56.9s
[2022-03-21T02:36:17.544547+01:00] [debug] no match: 56.9s, best candidate: addon-products-20190923 (0.00)
[2022-03-21T02:36:18.213362+01:00] [debug] considering VNC stalled, no update for 4.00 seconds
[2022-03-21T02:36:20.216730+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.94:5901>: IO::Socket::INET: connect: Connection refused
(…)
[2022-03-21T02:37:17.290986+01:00] [warn] !!! consoles::VNC::login: Error connecting to VNC server <10.161.145.94:5901>: IO::Socket::INET: connect: Connection refused
[2022-03-21T02:37:18.294195+01:00] [info] ::: backend::baseclass::die_handler: Backend process died, backend errors are reported below in the following lines:
Error connecting to VNC server <10.161.145.94:5901>: IO::Socket::INET: connect: Connection refused
[2022-03-21T02:37:18.294914+01:00] [debug] Closing SSH serial connection with s390zp19.suse.de
[2022-03-21T02:37:18.295244+01:00] [debug] Destroying openQA-SUT-5 virtual machine
Likely the VNC server is just not reachable anymore at this point (and therefore extending the timeout further is pointless).
I'd say that maybe the test it just doing something that turns down the VM. However, the problem happens in different test modules (e.g. lvm or snapper_create) and most of the times it also just works.
Updated by okurz over 2 years ago
I have the feeling this might be related to #107062 as well
Updated by coolgw over 2 years ago
Another kind of error also found in recent run:
https://openqa.nue.suse.com/tests/8459244/logfile?filename=autoinst-log.txt
[2022-04-02T14:18:07.546982+02:00] [warn] !!! consoles::VNC::catch {...} : Error in VNC protocol - relogin: unexpected end of data
Updated by openqa_review over 2 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: extra_tests_gnome
https://openqa.suse.de/tests/8573572#step/boot_to_desktop/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
Updated by okurz over 2 years ago
- Due date set to 2022-05-06
- Priority changed from Normal to High
Needs to be discussed urgently as it blocks #99345 which showed up in https://progress.opensuse.org/issues?query_id=544
Updated by livdywan over 2 years ago
Discussed this a bit:
- We should add a retry for this whilst we can't track down the cause (and reduce prio)
- Maybe #107026 is worth looking into first
Updated by mkittler over 2 years ago
- Subject changed from [tools] Test using svirt backend fails with auto_review:"Error connecting to VNC server.*: IO::Socket::INET: connect: Connection refused" to [tools] Test using svirt backend fails with auto_review:"Error connecting to VNC server.*: IO::Socket::INET: connect: Connection refused":retry
- Description updated (diff)
- Priority changed from High to Normal
Updated by livdywan over 2 years ago
- Related to action #107026: Improve existing unit tests for VNC module to increase its test coverage (before doing any actual changes) size:M added
Updated by okurz over 2 years ago
- Due date deleted (
2022-05-06) - Priority changed from Normal to High
Updated by okurz over 2 years ago
- Status changed from Feedback to Blocked
- Priority changed from High to Normal
Let's consider this blocked by #107026 then. I assume that unit testing is unlikely to fix this problem directly but will help us to rule out some unlikely hypotheses as well as help when we want to introduce changes or fixes later on.
Updated by okurz over 2 years ago
- Status changed from Blocked to New
- Assignee deleted (
mkittler)
Updated by openqa_review over 2 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: offline_sles15sp3_pscc_base_all_minimal_zVM_preparation@s390x-zVM-Upgrade-m1
https://openqa.suse.de/tests/8763790#step/reconnect_mgmt_console/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
Updated by openqa_review about 2 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: online_sles15sp3_smt_basesys-srv_all_full_zypp
https://openqa.suse.de/tests/8776759#step/zypper_patch/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 56 days if nothing changes in this ticket.
Updated by openqa_review over 1 year ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: online_sles15sp2_ltss_scc_basesys-srv_all_full_y
https://openqa.suse.de/tests/10394125#step/online_migration_setup/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 328 days if nothing changes in this ticket.
Updated by JERiveraMoya over 1 year ago
- Subject changed from [tools] Test using svirt backend fails with auto_review:"Error connecting to VNC server.*: IO::Socket::INET: connect: Connection refused":retry to [tools] Test using svirt backend fails with auto_review:"Error connecting to VNC server.*: IO::Socket::INET: connect: Connection refused"
removing :retry
yes, it seems that goes infinite otherwise:
https://openqa.suse.de/tests/10707387#next_previous
Updated by openqa_review over 1 year ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: create_hdd_textmode@svirt-xen-pv
https://openqa.suse.de/tests/10849582#step/bootloader_svirt/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
Updated by openqa_review over 1 year ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: offline_sles15sp2_ltss_pscc_basesys-srv-desk-dev-contm-lgm-py2-tsm-wsm-pcm_all_full
https://openqa.suse.de/tests/11116302#step/install_service/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 76 days if nothing changes in this ticket.
Updated by okurz about 1 year ago
- Related to coordination #131519: [epic] Additional redundancy for OSD virtualization testing added
Updated by openqa_review about 1 year ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: qam-sles4sap_online_dvd_gnome_hana_nvdimm@64bit-ipmi-nvdimm
https://openqa.suse.de/tests/11482744#step/boot_from_pxe/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
Updated by openqa_review about 1 year ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: default@svirt-xen-pv
https://openqa.suse.de/tests/11861110
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
Updated by okurz 7 months ago
- Related to action #152578: Many incompletes with "Error connecting to VNC server <unreal6.qe.nue2.suse.org:...>" size:M added
Updated by openqa_review 6 months ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: jeos-apparmor@svirt-hyperv-uefi
https://openqa.suse.de/tests/13784636#step/bootloader_hyperv/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
Updated by rfan1 6 months ago
- Related to action #158329: [qe-core][sle15sp2][s390x]test fails in bootloader_start, ssh/vnc sessions lost with unknown reason during the installation phase 1 added
Updated by openqa_review 5 months ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: btrfs@64bit-ipmi
https://openqa.suse.de/tests/14069292#step/ipxe_install/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
Updated by livdywan 5 months ago
The ticket has been in future for 2 years, yet it is a bug reference in failing production tests. I feel like we should make sure such messages are ignored (if we de fact don't see a point in addressing it), work on improving this or replace the reference with a more relevant ticket.
Updated by openqa_review 4 months ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: jeos-containers-docker@svirt-hyperv-uefi
https://openqa.suse.de/tests/14518468#step/bootloader_hyperv/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.