action #75364

openQA Infrastructure - action #64279: [OS upgrade] upgrade xen host openqaw5-xen.qa.suse.de

[qac] job incompletes with auto_review:"(?s)Error connecting to VNC server.*openqa.*-xen.*backend died: socket does not exist. Probably your backend instance could not start or died.*"

Added by Xiaojing_liu about 1 year ago. Updated 5 months ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: Bugs in existing tests
Target version: -
Start date: 2020-10-27
Due date:
% Done: 0%
Estimated time:
Difficulty:
Tags:

Description

https://openqa.suse.de/tests/4889195 is incomplete; the log shows:

[2020-10-26T18:40:02.633 CET] [debug] tests/console/snapper_jeos_cli.pm:80 called snapper_jeos_cli::rollback_and_reboot -> tests/console/snapper_jeos_cli.pm:43 called power_action_utils::power_action -> lib/power_action_utils.pm:308 called power_action_utils::assert_shutdown_and_restore_system -> lib/power_action_utils.pm:371 called testapi::select_console
[2020-10-26T18:40:02.633 CET] [debug] <<< testapi::select_console(testapi_console="sut")
/usr/lib/os-autoinst/consoles/vnc_base.pm:62:{
  "password" => "nots3cr3t",
  "hostname" => "openqaw5-xen.qa.suse.de",
  "port" => 5902
}
[2020-10-26T18:40:04.637 CET] [debug] Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2020-10-26T18:40:05.638 CET] [debug] Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2020-10-26T18:40:06.640 CET] [debug] Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2020-10-26T18:40:07.641 CET] [debug] Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2020-10-26T18:40:08.642 CET] [debug] Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2020-10-26T18:40:09.643 CET] [debug] Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2020-10-26T18:40:10.644 CET] [debug] Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2020-10-26T18:40:11.646 CET] [debug] Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: IO::Socket::INET: connect: Connection refused
[2020-10-26T18:40:12.649 CET] [debug] Backend process died, backend errors are reported below in the following lines:
socket does not exist. Probably your backend instance could not start or died. at /usr/lib/os-autoinst/consoles/VNC.pm line 881.

[2020-10-26T18:40:12.649 CET] [debug] Closing SSH serial connection with openqaw5-xen.qa.suse.de
[2020-10-26T18:40:12.650 CET] [debug] Passing remaining frames to the video encoder
[2020-10-26T18:40:12.679 CET] [debug] Waiting for video encoder to finalize the video
[2020-10-26T18:40:12.679 CET] [debug] The built-in video encoder (pid 18516) terminated
[2020-10-26T18:40:12.679 CET] [debug] SSH disconnect hostname=openqaw5-xen.qa.suse.de,username=root
[2020-10-26T18:40:12.679 CET] [debug] sending magic and exit
[2020-10-26T18:40:12.680 CET] [debug] received magic close
[2020-10-26T18:40:12.681 CET] [debug] THERE IS NOTHING TO READ 15 4 3
[2020-10-26T18:40:12.681 CET] [debug] stopping command server 18416 because test execution ended
[2020-10-26T18:40:12.681 CET] [debug] isotovideo: informing websocket clients before stopping command server: http://127.0.0.1:20133/x4l5yowHjPYG6hIe/broadcast
[2020-10-26T18:40:12.704 CET] [debug] commands process exited: 0
[2020-10-26T18:40:12.709 CET] [debug] backend process exited: 0
[2020-10-26T18:40:12.709 CET] [debug] done with command server
[2020-10-26T18:40:12.709 CET] [debug] stopping autotest process 18422
[2020-10-26T18:40:12.709 CET] [debug] autotest received signal TERM, saving results of current test before exiting
XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":60663"
      after 28807 requests (28807 known processed) with 0 events remaining.
[2020-10-26T18:40:12.711 CET] [debug] Driver backend collected unknown process with pid 18553 and exit status: 1
xterm: fatal IO error 11 (Resource temporarily unavailable) or KillClient on X server ":60663"
[2020-10-26T18:40:12.714 CET] [debug] Driver backend collected unknown process with pid 18558 and exit status: 0
[2020-10-26T18:40:12.714 CET] [debug] Driver backend collected unknown process with pid 18555 and exit status: 84
[2020-10-26T18:40:12.714 CET] [debug] Driver backend collected unknown process with pid 18557 and exit status: 0
[2020-10-26T18:40:12.726 CET] [debug] Driver backend collected unknown process with pid 18532 and exit status: 0
[2020-10-26T18:40:12.727 CET] [debug] [autotest] process exited: 1
[2020-10-26T18:40:12.827 CET] [debug] done with autotest process
[2020-10-26T18:40:12.827 CET] [debug] isotovideo failed
[2020-10-26T18:40:12.828 CET] [debug] stopping backend process 18423
[2020-10-26T18:40:12.828 CET] [debug] done with backend process
18412: EXIT 1

See more details in https://openqa.suse.de/tests/4889195/file/autoinst-log.txt


Related issues

Has duplicate openQA Project - action #71236: job incompletes with auto_review:"backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused" (Rejected, 2020-09-11)

History

#1 Updated by okurz about 1 year ago

  • Tags set to qac, jeos, xen
  • Project changed from openQA Project to openQA Tests
  • Subject changed from job incompletes with auto_review:"backend died: socket does not exist. Probably your backend instance could not start or died.*" to [qac] job incompletes with auto_review:"(?s)Error connecting to VNC server.*openqa.*-xen.*backend died: socket does not exist. Probably your backend instance could not start or died.*"
  • Category set to Bugs in existing tests
  • Assignee set to jlausuch
  • Priority changed from Low to High

Maintenance of special worker addendums including the Xen hypervisor host is out of scope for SUSE QA Tools (https://progress.opensuse.org/projects/qa/wiki#Out-of-scope). As the test is about "JeOS" I will assign it to the QAC team.

@Xiaojing_liu I suggest being a bit more specific with the auto_review regex to prevent it from matching too many generic issues, e.g. if that symptom also appears for other backends or machines.
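
To illustrate the point about regex specificity, here is a minimal sketch in Python that compares the original subject pattern with the narrowed one. Both patterns are copied from the subject change above. The sample texts and the helper `matches` are hypothetical and only serve the illustration; in particular, the sketch assumes that the text being searched contains the "Error connecting to VNC server" log line followed by the "backend died: ..." message, and it does not reproduce how auto-review itself collects or searches that text.

```python
import re

# Patterns copied from the ticket history: the original subject and the narrowed one.
BROAD = r'backend died: socket does not exist. Probably your backend instance could not start or died.*'
NARROW = (r'(?s)Error connecting to VNC server.*openqa.*-xen.*'
          r'backend died: socket does not exist. Probably your backend instance could not start or died.*')

# Message as quoted in the ticket subject.
REASON = ("backend died: socket does not exist. "
          "Probably your backend instance could not start or died.")

# Hypothetical sample texts (assumption: the VNC log line and the message
# end up in the same searched text, separated by a newline).
xen_text = ("Error connecting to VNC server <openqaw5-xen.qa.suse.de:5902>: "
            "IO::Socket::INET: connect: Connection refused\n" + REASON)
generic_text = REASON  # same symptom, but no hint of the Xen host


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None


for name, pattern in (("broad", BROAD), ("narrow", NARROW)):
    print(name, "xen:", matches(pattern, xen_text),
          "generic:", matches(pattern, generic_text))
```

In this sketch the broad pattern matches both samples, while the narrowed pattern only matches the text that mentions a VNC connection error on an `openqa...-xen` host. The leading `(?s)` lets `.` also match newlines, so the pattern can span several lines.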

#2 Updated by okurz about 1 year ago

  • Has duplicate action #71236: job incompletes with auto_review:"backend died: Error connecting to VNC server <openqaw5-xen.qa.suse.de:5901>: IO::Socket::INET: connect: Connection refused" added

#3 Updated by jlausuch about 1 year ago

What am I supposed to do with this? Just tag the failed test I suppose :)
This looks like the same nature of https://progress.opensuse.org/issues/71236

#4 Updated by okurz about 1 year ago

jlausuch wrote:

What am I supposed to do with this? Just tag the failed test I suppose :)

Well, this is about incomplete jobs, so "failed" tests would not really fit. And with the "auto_review" keyword in the subject line there should be no need to manually label builds ("tagging" is for builds). See more about auto-review on https://gitlab.suse.de/openqa/auto-review/ if you are interested.

So what I can suggest is a couple of things:

The QE Tools team is happy to offer help but does not have the capacity to improve the "special worker addendums" that are used for tests here themselves.

This looks like the same nature of #71236

Yes, this is why I rejected #71236 as a duplicate of this ticket. But you should not point back to the duplicate ticket, otherwise you are caught in an infinite circle ;)

#5 Updated by cfconrad about 1 year ago

  • Priority changed from High to Normal

Set to priority Normal, as later runs didn't show this incomplete behavior anymore.

#6 Updated by mloviska about 1 year ago

  • Tags changed from qac, jeos, xen to qac, xen
  • Status changed from New to Blocked
  • Assignee deleted (jlausuch)
  • Parent task set to #64279

First we need to resolve the OS upgrade. Let me set this one as blocked.

#7 Updated by okurz about 1 year ago

Please be aware that #64279 is out of scope of the SUSE QE Tools team, see https://progress.opensuse.org/projects/qa/wiki/Wiki#Out-of-scope .

#8 Updated by jlausuch 5 months ago

  • Status changed from Blocked to Resolved

After the Xen host update done by Martin, we haven't observed this issue anymore.
