action #106654
closed
[ipmi][openqa][vnc] Massive test run failures with 'IO::Socket::INET: connect: Connection refused' due to "Use of uninitialized value.*connect_timeout in addition.*consoles/VNC.pm line 13.*":retry
Description
Observation
openQA test runs are failing more and more often with the same error:
[2022-02-11T01:34:05.510069+01:00] [info] ::: basetest::runtest: # Test died: Error connecting to VNC server localhost:38869: IO::Socket::INET: connect: Connection refused at /usr/lib/os-autoinst/testapi.pm line 1762.
However, all manual SSH attempts succeeded.
The relevant test runs are listed below:
https://openqa.suse.de/tests/8140400
https://openqa.suse.de/tests/8140119
https://openqa.suse.de/tests/8139589
https://openqa.suse.de/tests/8137426
https://openqa.suse.de/tests/8137424
Steps to reproduce
Find jobs referencing this ticket with the help of
https://raw.githubusercontent.com/os-autoinst/scripts/master/openqa-query-for-job-label :
call openqa-query-for-job-label poo#106654
Updated by okurz over 2 years ago
- Project changed from openQA Infrastructure to openQA Project
- Category set to Regressions/Crashes
- Status changed from New to Feedback
- Assignee set to okurz
- Target version set to Ready
Please follow the ticket template from https://progress.opensuse.org/projects/openqav3/wiki/#Defects . I saw that the problems only appear with the new build. Did you crosscheck with an older build? Did you crosscheck with an older state of tests?
I triggered investigation runs with
echo https://openqa.suse.de/tests/8140400 | host=openqa.suse.de openqa-investigate
The resulting comment is available at https://openqa.suse.de/tests/8140400#comment-485767 . Awaiting test results.
Updated by okurz over 2 years ago
- Related to action #105882: Test using svirt backend fails with auto_review:"Error connecting to VNC server.*localhost.*Connection refused" added
Updated by okurz over 2 years ago
- Status changed from Feedback to In Progress
- uefi-gi-guest_sles12sp5-on-host_developing-xen:investigate:retry: https://openqa.suse.de/t8140847 passed
- uefi-gi-guest_sles12sp5-on-host_developing-xen:investigate:last_good_tests:78a8b9064552b95543ab89a27a5678bb7987bbc8: https://openqa.suse.de/t8140848 failed
- uefi-gi-guest_sles12sp5-on-host_developing-xen:investigate:last_good_build:91.2: https://openqa.suse.de/t8140849 failed
- uefi-gi-guest_sles12sp5-on-host_developing-xen:investigate:last_good_tests_and_build:78a8b9064552b95543ab89a27a5678bb7987bbc8+91.2: https://openqa.suse.de/t8140935 passed
So the results are inconclusive. I triggered
build=okurz_investigation_poo106654; end=040 openqa-clone-set https://openqa.suse.de/tests/8140935 ${build}_uefi-gi-guest_sles12sp5-on-host_developing-xen:investigate:last_good_tests_and_build:78a8b9064552b95543ab89a27a5678bb7987bbc8+91.2 SCHEDULE=tests/boot/boot_from_pxe
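Why the four results above do not point to a single cause can be sketched as follows. This is only an illustration of the reasoning: the function name and the classification rules are assumptions made for this example and are not taken from openqa-investigate itself.

```python
def interpret_investigation(passed):
    """Summarize investigation outcomes.

    `passed` maps each investigation type to True (job passed) or False.
    The rules are an assumed heuristic, not the logic of openqa-investigate.
    """
    if passed["retry"]:
        # The unchanged retry passed, so the original failure is not
        # deterministic and single runs cannot isolate the cause.
        return "inconclusive: sporadic failure, repeat the scenario many times"
    tests_ok = passed["last_good_tests"]
    build_ok = passed["last_good_build"]
    if tests_ok and not build_ok:
        return "likely test regression"
    if build_ok and not tests_ok:
        return "likely product regression"
    return "inconclusive: no single change explains the failure"


# Results reported in this ticket (True = passed).
ticket_results = {
    "retry": True,
    "last_good_tests": False,
    "last_good_build": False,
    "last_good_tests_and_build": True,
}
print(interpret_investigation(ticket_results))
```

With a sporadic failure like this one, repeating the same scenario many times, as done with the clone set of 40 jobs, gives a more reliable picture than any single investigation job.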
Updated by mkittler over 2 years ago
This PR should fix the problem; I'm currently testing it on openqaworker2: https://github.com/os-autoinst/os-autoinst/pull/1953
The test job I've created is still scheduled: https://openqa.suse.de/tests/8141551
Updated by okurz over 2 years ago
- Related to action #99663: Use more perl signatures - os-autoinst size:M added
Updated by openqa_review over 2 years ago
- Due date set to 2022-02-26
Setting due date based on mean cycle time of SUSE QE Tools
Updated by okurz over 2 years ago
- Subject changed from [ipmi][openqa][vnc] Massive test run failures due to "IO::Socket::INET: connect: Connection refused" to [ipmi][openqa][vnc] Massive test run failures with 'IO::Socket::INET: connect: Connection refused' due to auto_review:"Use of uninitialized value.*connect_timeout in addition.*consoles/VNC.pm line 13.*":retry
- Description updated (diff)
The fix was deployed with https://gitlab.suse.de/openqa/osd-deployment/-/pipelines/318539 . https://openqa.suse.de/tests/overview?build=okurz_investigation_poo106654_uefi-gi-guest_sles12sp5-on-host_developing-xen%3Ainvestigate%3Alast_good_tests_and_build%3A78a8b9064552b95543ab89a27a5678bb7987bbc8%2B91.2&distri=sle&version=15-SP4 shows 39/40 passed, for jobs that either happened to pass before the deployment or ran with the fix applied. The one failed job was running before the deployment. I retriggered this job as https://openqa.suse.de/tests/8152116 , currently running. I am adding auto-review control parameters to the subject as described in https://github.com/os-autoinst/scripts/blob/master/README.md#auto-review---automatically-detect-known-issues-in-openqa-jobs-label-openqa-jobs-with-ticket-references-and-optionally-retrigger .
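How the auto_review control parameters in the subject are meant to work can be illustrated with a small Python sketch. It assumes the label has the form auto_review:"<regex>" with an optional :retry suffix, and that the regex is searched for in the job log. The helper names and the sample warning line are hypothetical; the real auto-review tooling is a set of shell scripts and may match differently.

```python
import re

# Subject as set in this ticket, including the auto_review label.
SUBJECT = (
    "[ipmi][openqa][vnc] Massive test run failures with "
    "'IO::Socket::INET: connect: Connection refused' due to "
    'auto_review:"Use of uninitialized value.*connect_timeout in addition'
    '.*consoles/VNC.pm line 13.*":retry'
)

# Hypothetical log line, written only to illustrate a match.
SAMPLE_WARNING = (
    "Use of uninitialized value $connect_timeout in addition (+) "
    "at /usr/lib/os-autoinst/consoles/VNC.pm line 13."
)

# Error line quoted in the ticket description.
VNC_ERROR = (
    "Test died: Error connecting to VNC server localhost:38869: "
    "IO::Socket::INET: connect: Connection refused "
    "at /usr/lib/os-autoinst/testapi.pm line 1762."
)


def parse_auto_review(subject):
    """Return (search_term, retry) from a subject, or None if unlabeled."""
    match = re.search(r'auto_review:"(?P<term>.+)"(?P<retry>:retry)?', subject)
    if match is None:
        return None
    return match.group("term"), match.group("retry") is not None


def log_matches(term, log_text):
    """Check whether the search term occurs anywhere in the log text."""
    return re.search(term, log_text) is not None


term, retry = parse_auto_review(SUBJECT)
print(retry, log_matches(term, SAMPLE_WARNING), log_matches(term, VNC_ERROR))
```

The search term targets the warning about the uninitialized connect_timeout rather than the generic "Connection refused" message, so only jobs hitting this specific problem would be labeled and retried.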
Updated by waynechen55 over 2 years ago
okurz wrote:
please follow the ticket template from https://progress.opensuse.org/projects/openqav3/wiki/#Defects . I saw that the problems only appear with the new build. Did you crosscheck with an older build? Did you crosscheck with an older state of tests?
I triggered investigation runs now with
echo https://openqa.suse.de/tests/8140400 | host=openqa.suse.de openqa-investigate
, comment available in https://openqa.suse.de/tests/8140400#comment-485767 . Awaiting test results.
Templates will be followed.
okurz wrote:
The fix was deployed with https://gitlab.suse.de/openqa/osd-deployment/-/pipelines/318539 . https://openqa.suse.de/tests/overview?build=okurz_investigation_poo106654_uefi-gi-guest_sles12sp5-on-host_developing-xen%3Ainvestigate%3Alast_good_tests_and_build%3A78a8b9064552b95543ab89a27a5678bb7987bbc8%2B91.2&distri=sle&version=15-SP4 shows 39/40 passed for tests that either just passed in before or with the fix applied. One job failed which was running before deployment. I retriggered this job as https://openqa.suse.de/tests/8152116 , currently running. Adding https://github.com/os-autoinst/scripts/blob/master/README.md#auto-review---automatically-detect-known-issues-in-openqa-jobs-label-openqa-jobs-with-ticket-references-and-optionally-retrigger control parameters.
Thanks. I will follow up on the results.
Updated by waynechen55 over 2 years ago
mkittler wrote:
This PR should fix the problem, I'm currently testing it on openqaworker2: https://github.com/os-autoinst/os-autoinst/pull/1953
The test job I've created is still scheduled: https://openqa.suse.de/tests/8141551
Thanks for your info.
Updated by waynechen55 over 2 years ago
- Status changed from In Progress to Resolved
Updated by openqa_review over 2 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: jeos@RPi2B
https://openqa.opensuse.org/tests/2191462
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
Updated by okurz over 2 years ago
- Due date deleted (2022-02-26)
- Status changed from Resolved to Feedback
The above job is 21 days old but is still the latest job in this scenario. I retriggered it in the hope of ending up with a successful cloned job. Currently running: https://openqa.opensuse.org/tests/2236473
Updated by okurz over 2 years ago
- Subject changed from [ipmi][openqa][vnc] Massive test run failures with 'IO::Socket::INET: connect: Connection refused' due to auto_review:"Use of uninitialized value.*connect_timeout in addition.*consoles/VNC.pm line 13.*":retry to [ipmi][openqa][vnc] Massive test run failures with 'IO::Socket::INET: connect: Connection refused' due to "Use of uninitialized value.*connect_timeout in addition.*consoles/VNC.pm line 13.*":retry
- Status changed from Feedback to Resolved
https://openqa.opensuse.org/tests/2282397 looks OK, or at least it is not failing as before.