action #180641
closed [sporadic] Tests fail with auto_review:"hostfwd.*Could not set up host forwarding rule":retry
Description
Log:
[2025-04-09T15:14:20.237552Z] [warn] [pid:75678] !!! : qemu-system-aarch64: -netdev user,id=qanet0,hostfwd=tcp::39102-:39102: Could not set up host forwarding rule 'tcp::39102-:39102'
Updated by okurz about 1 month ago
- Tags set to infra, reactive work, o3
- Category set to Regressions/Crashes
- Priority changed from Normal to High
- Target version set to Ready
Updated by mkittler 29 days ago
- Status changed from New to In Progress
- Assignee set to mkittler
- Priority changed from High to Normal
Maybe this happened due to a leftover QEMU process or some other process blocking the port, but it is hard to tell now.
The next jobs worked again: https://openqa.opensuse.org/admin/workers/986
The restarted job also passed: https://openqa.opensuse.org/tests/4982711
So definitely not a persistent problem.
I couldn't find anything in the worker journal although I might not have been using the best search terms (not quite sure what to look for).
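For context (my reading of the error, not something visible in the logs): `hostfwd` makes QEMU bind the host-side port itself, so any process already listening on tcp 39102 — a leftover QEMU, for instance — would produce exactly this error. A minimal Python sketch of that failure mode:

```python
import errno
import socket

# Reproduce the condition QEMU's hostfwd runs into: binding a TCP port
# that another process already holds fails with EADDRINUSE.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))   # let the OS pick a free port
first.listen()
port = first.getsockname()[1]

second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
err = None
try:
    second.bind(("127.0.0.1", port))  # same port while `first` holds it
except OSError as e:
    err = e.errno

assert err == errno.EADDRINUSE
second.close()
first.close()
```

If this is what happened, something like `ss -tlnp` on the worker at the time of the failure would have shown which process held the port.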
Just for the record, that's the full error message including the QEMU version:
[2025-04-09T15:14:20.237265Z] [debug] [pid:75678] QEMU: QEMU emulator version 8.2.9 (SUSE Linux Enterprise 15)
[2025-04-09T15:14:20.237411Z] [debug] [pid:75678] QEMU: Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers
[2025-04-09T15:14:20.237552Z] [warn] [pid:75678] !!! : qemu-system-aarch64: -netdev user,id=qanet0,hostfwd=tcp::39102-:39102: Could not set up host forwarding rule 'tcp::39102-:39102'
That's the full QEMU invocation:
[2025-04-09T15:14:20.042127Z] [debug] [pid:75678] starting: /usr/bin/qemu-system-aarch64 -device virtio-gpu-pci,edid=on,xres=1024,yres=768 -only-migratable -chardev ringbuf,id=serial0,logfile=serial0,logappend=on -serial chardev:serial0 -audiodev none,id=snd0 -device intel-hda -device hda-output,audiodev=snd0 -m 2048 -machine virt,gic-version=host -cpu host -mem-prealloc -mem-path /dev/hugepages/ -netdev user,id=qanet0,hostfwd=tcp::39102-:39102 -device virtio-net,netdev=qanet0,mac=52:54:00:12:34:56 -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0 -boot menu=on,splash-time=5000 -device nec-usb-xhci -device usb-tablet -device usb-kbd -smp 1 -enable-kvm -no-shutdown -vnc :102,share=force-shared -device virtio-serial -chardev pipe,id=virtio_console,path=virtio_console,logfile=virtio_console.log,logappend=on -device virtconsole,chardev=virtio_console,name=org.openqa.console.virtio_console -chardev pipe,id=virtio_console_user,path=virtio_console_user,logfile=virtio_console_user.log,logappend=on -device virtconsole,chardev=virtio_console_user,name=org.openqa.console.virtio_console_user -chardev socket,path=qmp_socket,server=on,wait=off,id=qmp_socket,logfile=qmp_socket.log,logappend=on -qmp chardev:qmp_socket -S -device virtio-scsi-pci,id=scsi0 -blockdev driver=file,node-name=hd0-overlay0-file,filename=/var/lib/openqa/pool/12/raid/hd0-overlay0,cache.no-flush=on -blockdev driver=qcow2,node-name=hd0-overlay0,file=hd0-overlay0-file,cache.no-flush=on,discard=unmap -device virtio-blk-device,id=hd0-device,drive=hd0-overlay0,bootindex=0,serial=hd0 -blockdev driver=file,node-name=cd0-overlay0-file,filename=/var/lib/openqa/pool/12/raid/cd0-overlay0,cache.no-flush=on -blockdev driver=qcow2,node-name=cd0-overlay0,file=cd0-overlay0-file,cache.no-flush=on,discard=unmap -device scsi-cd,id=cd0-device,drive=cd0-overlay0,serial=cd0 -drive id=pflash-code-overlay0,if=pflash,file=/var/lib/openqa/pool/12/raid/pflash-code-overlay0,unit=0,readonly=on -drive id=pflash-vars-overlay0,if=pflash,file=/var/lib/openqa/pool/12/raid/pflash-vars-overlay0,unit=1
In the good case (the restarted job) the hostfwd parameter looks the same, just with port 39100 instead of 39102:
[2025-04-09T15:33:15.779363Z] [debug] [pid:25459] starting: /usr/bin/qemu-system-aarch64 -device virtio-gpu-pci,edid=on,xres=1024,yres=768 -only-migratable -chardev ringbuf,id=serial0,logfile=serial0,logappend=on -serial chardev:serial0 -audiodev none,id=snd0 -device intel-hda -device hda-output,audiodev=snd0 -m 2048 -machine virt,gic-version=host -cpu host -mem-prealloc -mem-path /dev/hugepages/ -netdev user,id=qanet0,hostfwd=tcp::39100-:39100 -device virtio-net,netdev=qanet0,mac=52:54:00:12:34:56 -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0 -boot menu=on,splash-time=5000 -device nec-usb-xhci -device usb-tablet -device usb-kbd -smp 1 -enable-kvm -no-shutdown -vnc :100,share=force-shared -device virtio-serial -chardev pipe,id=virtio_console,path=virtio_console,logfile=virtio_console.log,logappend=on -device virtconsole,chardev=virtio_console,name=org.openqa.console.virtio_console -chardev pipe,id=virtio_console_user,path=virtio_console_user,logfile=virtio_console_user.log,logappend=on -device virtconsole,chardev=virtio_console_user,name=org.openqa.console.virtio_console_user -chardev socket,path=qmp_socket,server=on,wait=off,id=qmp_socket,logfile=qmp_socket.log,logappend=on -qmp chardev:qmp_socket -S -device virtio-scsi-pci,id=scsi0 -blockdev driver=file,node-name=hd0-overlay0-file,filename=/var/lib/openqa/pool/10/raid/hd0-overlay0,cache.no-flush=on -blockdev driver=qcow2,node-name=hd0-overlay0,file=hd0-overlay0-file,cache.no-flush=on,discard=unmap -device virtio-blk-device,id=hd0-device,drive=hd0-overlay0,bootindex=0,serial=hd0 -blockdev driver=file,node-name=cd0-overlay0-file,filename=/var/lib/openqa/pool/10/raid/cd0-overlay0,cache.no-flush=on -blockdev driver=qcow2,node-name=cd0-overlay0,file=cd0-overlay0-file,cache.no-flush=on,discard=unmap -device scsi-cd,id=cd0-device,drive=cd0-overlay0,serial=cd0 -drive id=pflash-code-overlay0,if=pflash,file=/var/lib/openqa/pool/10/raid/pflash-code-overlay0,unit=0,readonly=on -drive id=pflash-vars-overlay0,if=pflash,file=/var/lib/openqa/pool/10/raid/pflash-vars-overlay0,unit=1
I'm lowering the prio as it only happened once and I wouldn't know what else to check immediately.
Updated by mkittler 29 days ago
I was wondering where the hostfwd parameter comes from. So I checked os-autoinst and found nothing. Then I checked vars.json and found NICTYPE_USER_OPTIONS=hostfwd=tcp::39102-:39102. Then I searched for this variable in the openSUSE tests and found:
my $yuiport = get_port();
if (is_qemu) {
    # On qemu we connect to the worker using port forwarding
    set_var('NICTYPE_USER_OPTIONS', join(' ', grep($_, (
                    get_var('NICTYPE_USER_OPTIONS'),
                    "hostfwd=tcp::$yuiport-:$yuiport"))));
}
(see https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/lib/YuiRestClient.pm)
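For illustration, the join/grep pattern above just drops empty or undefined values and appends the new forwarding rule. A rough Python translation (hypothetical helper name, not actual test-distribution code):

```python
def build_nictype_user_options(existing, yuiport):
    """Mimic the Perl join(' ', grep($_, ...)) above: filter out empty
    values, then append the hostfwd rule for the YUI REST API port.
    (Illustrative translation only.)"""
    parts = [existing, f"hostfwd=tcp::{yuiport}-:{yuiport}"]
    return " ".join(p for p in parts if p)

# With no pre-existing options, the variable contains only the new rule:
assert build_nictype_user_options(None, 39102) == "hostfwd=tcp::39102-:39102"
# Pre-existing options are preserved and space-separated:
assert build_nictype_user_options("hostfwd=tcp::2222-:22", 39102) == \
    "hostfwd=tcp::2222-:22 hostfwd=tcp::39102-:39102"
```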
The port is computed like this:
sub init_port {
    $port = get_var('YUI_START_PORT', 39000);
    $port += get_var('VNC') =~ /(?<vncport>\d+)/ ? $+{vncport} : int(rand(1000));
    die "Cannot set port for YUI REST API" unless $port;
    …
Where the VNC port is computed like this:
use constant VNCPORT_OFFSET => $ENV{VNCPORT_OFFSET} // 90;
…
$ENV{VNC} = $instance_number + VNCPORT_OFFSET;
(done by the openQA worker, so $instance_number is the worker slot)
So the VNC port is a low number like 102 in this case, and we end up with the formula: 39000 + $instance_number + 90 (i.e. 39090 + $instance_number).
That means in theory we have distinct ports per worker slot in a port range that is not otherwise assigned. Not sure why it didn't work this one time then.
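Putting the snippets together, the port derivation can be sketched like this (illustrative Python with the defaults shown above; `yui_port` is a hypothetical name, the real code lives in YuiRestClient.pm and the worker):

```python
import re

VNCPORT_OFFSET = 90    # worker default, overridable via $ENV{VNCPORT_OFFSET}
YUI_START_PORT = 39000  # default of get_var('YUI_START_PORT', 39000)

def yui_port(instance_number):
    """Recompute the YUI REST API port the way the snippets above do:
    the worker sets VNC = instance_number + offset, and init_port adds
    the digits of VNC to the start port."""
    vnc = str(instance_number + VNCPORT_OFFSET)
    m = re.search(r"(?P<vncport>\d+)", vnc)
    return YUI_START_PORT + int(m.group("vncport"))

# Worker slot 12 (VNC display :102, as in the failing job) gives 39102:
assert yui_port(12) == 39102
# Slot 10 (the passing restarted job) gives 39100:
assert yui_port(10) == 39100
```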
Updated by openqa_review 29 days ago
- Due date set to 2025-04-26
Setting due date based on mean cycle time of SUSE QE Tools
Updated by mkittler 26 days ago
- Project changed from openQA Infrastructure (public) to openQA Project (public)
- Subject changed from `openqaworker-arm22` failed with `Could not set up host forwarding rule 'tcp::39102-:39102'` to [sporadic] Tests fail with auto_review:"hostfwd.*Could not set up host forwarding rule":retry
- Due date deleted (2025-04-26)
- Category deleted (Regressions/Crashes)
- Status changed from In Progress to Resolved
This also happened before on openqaworker-arm21:6:
https://openqa.opensuse.org/tests/4967633:
[2025-04-03T11:51:43.158454Z] [warn] [pid:78373] !!! : qemu-system-aarch64: -netdev user,id=qanet0,hostfwd=tcp::39096-:39096: Could not set up host forwarding rule 'tcp::39096-:39096'
This was on a different host/slot/port so this ticket is not specific to openqaworker-arm22. It is probably also not an infra ticket.
As mentioned before, this is probably due to a leftover QEMU process. If that's true, this would make this ticket a duplicate of #180110, so I am marking it as such for now.
Updated by mkittler 26 days ago
- Is duplicate of action #180110: [sporadic] auto_review:"Failed to find an available port: Address already in use":retry, produces incomplete jobs on OSD, multiple machines added