action #136238

closed

test incompletes with auto_review:"qemu-system-.*Address already in use":retry size:M

Added by okurz 7 months ago. Updated 7 months ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version:
Start date: 2023-09-21
Due date:
% Done: 0%
Estimated time:
Description

Observation

https://openqa.suse.de/tests/12221977

[2023-09-21T10:34:53.639587+02:00] [warn] [pid:52210] !!! : qemu-system-x86_64: -vnc :111,share=force-shared: Failed to find an available port: Address already in use

Steps to reproduce

Find jobs referencing this ticket with the help of
https://raw.githubusercontent.com/os-autoinst/scripts/master/openqa-query-for-job-label ,
call openqa-query-for-job-label poo#136238

Suggestions

  • Try to add a diagnostic message, i.e. report whatever is currently occupying that port
  • Find other occurrences of this happening, if any, e.g. by waiting for openqa-label-known-issues to label more jobs and then using the above query or manual SQL. If it's a one-off then just reject and don't waste time here :)
  • Check other logs from the affected system at the recorded times, for example references in the worker log or system journal
  • Find patterns of commonality among the affected jobs, e.g. whether only specific worker machines are affected
  • Fix it for good
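
As a rough illustration of the first suggestion, here is a minimal sketch of a port probe. It only detects that a TCP port is busy; it does not identify the owning process (something like `ss -ltnp` on the worker would be needed for that). The function names `port_is_busy` and `diagnose` are hypothetical and not part of os-autoinst. Port 6011 is assumed to correspond to the `-vnc :111` display from the log above (5900 + 111).

```python
import errno
import socket


def port_is_busy(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if binding a TCP socket to host:port fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # any other error is unexpected, so do not hide it
    return False


def diagnose(port: int, host: str = "0.0.0.0") -> str:
    """Build a human-readable diagnostic line for a log message (hypothetical helper)."""
    if port_is_busy(port, host):
        return f"TCP port {port} is already in use"
    return f"TCP port {port} is free"


if __name__ == "__main__":
    # 6011 = 5900 + 111, assuming the usual VNC display-to-port mapping
    print(diagnose(6011))
```

A probe like this is racy by nature (the port can be taken between the check and the QEMU start), so it is only useful for adding context to an error message, not for preventing the failure.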
Actions #1

Updated by okurz 7 months ago

  • Description updated (diff)
Actions #2

Updated by livdywan 7 months ago

  • Description updated (diff)
  • Status changed from New to Workable
Actions #3

Updated by livdywan 7 months ago

  • Subject changed from test incompletes with auto_review:"2023-09-2.*Address already in use":retry to test incompletes with auto_review:"2023-09-2.*Address already in use":retry size:M
Actions #4

Updated by okurz 7 months ago

$ openqa-query-for-job-label poo#136238
12222208|2023-09-21 08:51:06|done|incomplete|qam-SAPHanaSR_ScaleUp_PerfOpt_supportserver|backend died: QEMU exited unexpectedly, see log for details|worker29
12222206|2023-09-21 08:49:37|done|incomplete|qam-SAPHanaSR_ScaleUp_PerfOpt_WMP_node01|backend died: QEMU exited unexpectedly, see log for details|worker29
12222537|2023-09-21 08:46:50|done|incomplete|mau-autofs-server|backend died: QEMU exited unexpectedly, see log for details|worker29
12222563|2023-09-21 08:44:24|done|incomplete|qam_kernel_multipath_supportserver|backend died: QEMU exited unexpectedly, see log for details|worker29
12222105|2023-09-21 08:43:01|done|incomplete|qam_ha_priority_fencing_node01|backend died: QEMU exited unexpectedly, see log for details|worker29
12222098|2023-09-21 08:35:37|done|incomplete|qam_ha_hawk_haproxy_node02|backend died: QEMU exited unexpectedly, see log for details|worker29
12221977|2023-09-21 08:34:54|done|incomplete|qam-SAPHanaSR_ScaleUp_PerfOpt_supportserver|backend died: QEMU exited unexpectedly, see log for details|worker29
12221915|2023-09-21 08:25:43|done|incomplete|qam_2nodes_02|backend died: QEMU exited unexpectedly, see log for details|worker29
12221884|2023-09-21 08:19:30|done|incomplete|qam_ha_rolling_update_node01|backend died: QEMU exited unexpectedly, see log for details|worker29
12221877|2023-09-21 08:17:42|done|incomplete|qam_ha_hawk_client|backend died: QEMU exited unexpectedly, see log for details|worker29
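
To look for commonality among the affected jobs, the pipe-separated output above can be grouped by its last column. The following is a minimal sketch, assuming each line has seven `|`-separated fields with the worker name last; the field layout is inferred from the output above and the helper name `jobs_per_worker` is hypothetical. Only three of the ten lines are repeated here as sample input.

```python
from collections import Counter

# Three lines copied from the query output above
SAMPLE = """\
12222208|2023-09-21 08:51:06|done|incomplete|qam-SAPHanaSR_ScaleUp_PerfOpt_supportserver|backend died: QEMU exited unexpectedly, see log for details|worker29
12222206|2023-09-21 08:49:37|done|incomplete|qam-SAPHanaSR_ScaleUp_PerfOpt_WMP_node01|backend died: QEMU exited unexpectedly, see log for details|worker29
12222537|2023-09-21 08:46:50|done|incomplete|mau-autofs-server|backend died: QEMU exited unexpectedly, see log for details|worker29
"""


def jobs_per_worker(output: str) -> Counter:
    """Count jobs per worker, assuming the worker is the 7th '|'-separated field."""
    counts = Counter()
    for line in output.splitlines():
        fields = line.split("|")
        if len(fields) != 7:
            continue  # skip the command line, blank lines, or unexpected formats
        counts[fields[-1]] += 1
    return counts


if __name__ == "__main__":
    for worker, count in jobs_per_worker(SAMPLE).most_common():
        print(f"{worker}: {count}")
```

In the full output above, all ten jobs ran on worker29.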
Actions #5

Updated by livdywan 7 months ago

  • Subject changed from test incompletes with auto_review:"2023-09-2.*Address already in use":retry size:M to test incompletes with auto_review:"qemu-system-.*Address already in use":retry size:M

We found another match via auto_review. However, it seems to be a different issue, since this is YaST's Xvnc and not the VNC server that QEMU exposes:

*** Starting YaST2 ***
2023-09-26T00:07:56.494613-04:00 install systemd[6103]: xvnc.socket: Failed to create listening socket ([::]:5901): Address already in use
[FAILED] Failed to listen on Xvnc Server.
2023-09-26T00:07:56.504394-04:00 install systemd[1]: xvnc.socket: Failed to receive listening socket ([::]:5901): Input/output error
2023-09-26T00:07:56.504490-04:00 install systemd[1]: Failed to listen on Xvnc Server.
removed '/root/.vnc/passwd.yast'
%@

So I'm removing the label and tweaking the regex.
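
The effect of the regex change can be checked against the two log lines quoted in this ticket. This is a minimal sketch using Python's `re` module on single lines; how openqa-label-known-issues actually applies the auto_review expression is not shown here, so the matching mode is an assumption.

```python
import re

OLD = re.compile(r"2023-09-2.*Address already in use")
NEW = re.compile(r"qemu-system-.*Address already in use")

# Log lines copied from this ticket
QEMU_LINE = (
    "[2023-09-21T10:34:53.639587+02:00] [warn] [pid:52210] !!! : "
    "qemu-system-x86_64: -vnc :111,share=force-shared: "
    "Failed to find an available port: Address already in use"
)
XVNC_LINE = (
    "2023-09-26T00:07:56.494613-04:00 install systemd[6103]: xvnc.socket: "
    "Failed to create listening socket ([::]:5901): Address already in use"
)

if __name__ == "__main__":
    for label, pattern in (("old", OLD), ("new", NEW)):
        for name, line in (("qemu", QEMU_LINE), ("xvnc", XVNC_LINE)):
            print(f"{label} regex, {name} line: {bool(pattern.search(line))}")
```

The old expression matches both lines because it only keys on the date prefix, while the new one requires the `qemu-system-` prefix and therefore skips the Xvnc failure.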

Actions #6

Updated by livdywan 7 months ago

  • Status changed from Workable to Feedback
  • Assignee set to livdywan

Also assigning since out of three people that looks better than "nobody" ;-)

Actions #7

Updated by livdywan 7 months ago

./openqa-query-for-job-label poo#136238
12312274|2023-09-26 23:35:23|done|failed|sle_autoyast_support_image_gnome_12sp5||worker2

Doesn't seem to have occurred again so far

Actions #8

Updated by livdywan 7 months ago

  • Status changed from Feedback to Resolved

Regarding the suggestion "Try and add a diagnostic message i.e. whatever is currently occupied on that port":

I feel that I don't understand the logic well enough to suggest a change that fits into roughly 15 minutes. It seems that os-autoinst defaults VNC to 90, and in the trivial case this value is used with -vnc .... The option can also contain more arguments, and the VNC console later uses port 5900 + VNC. What I don't see, though, is what assigns the different port values we see in tests, so maybe there is already some querying of open ports that could be improved? Most likely that is the code that should be extended to provide more useful error handling.
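
For reference, the 5900 + VNC arithmetic mentioned above can be written out as follows. This is only a sketch of the usual VNC display-to-port convention, with a hypothetical helper name; it is not code from os-autoinst. The value 90 is the default as I understand it, and 111 is the display from the failing job's log.

```python
VNC_BASE_PORT = 5900  # conventional TCP port of VNC display :0


def vnc_tcp_port(display: int) -> int:
    """Map a VNC display number (the N in '-vnc :N') to its TCP port."""
    return VNC_BASE_PORT + display


if __name__ == "__main__":
    print(vnc_tcp_port(90))   # 5990
    print(vnc_tcp_port(111))  # 6011
```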

With that I'm wrapping the ticket up.
