action #14334

closed

job incomplete: "could not configure /dev/net/tun (tap00): Device or resource busy"

Added by okurz over 7 years ago. Updated almost 7 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: Bugs in existing tests
Start date: 2016-10-20
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

observation

Job t#621407 is incomplete. The log at https://openqa.suse.de/tests/621407/file/autoinst-log.txt shows the error details:

DIE Died at /usr/lib/os-autoinst/consoles/vnc_base.pm line 76.

 at /usr/lib/os-autoinst/backend/baseclass.pm line 73.
    backend::baseclass::die_handler('Died at /usr/lib/os-autoinst/consoles/vnc_base.pm line 76.\x{a}') called at /usr/lib/os-autoinst/consoles/vnc_base.pm line 76
    consoles::vnc_base::catch {...} ('Error connecting to host <localhost>\x{a}$VAR1 = bless( {\x{a}       ...') called at /usr/lib/perl5/vendor_perl/5.18.2/Try/Tiny.pm line 104
    Try::Tiny::try('CODE(0x61b1c78)', 'Try::Tiny::Catch=REF(0x627a2b8)') called at /usr/lib/os-autoinst/consoles/vnc_base.pm line 80
    consoles::vnc_base::connect_vnc('consoles::vnc_base=HASH(0x61b9a20)', 'HASH(0x5f79320)') called at /usr/lib/os-autoinst/consoles/vnc_base.pm line 37
    consoles::vnc_base::activate('consoles::vnc_base=HASH(0x61b9a20)') called at /usr/lib/os-autoinst/consoles/console.pm line 74
    consoles::console::select('consoles::vnc_base=HASH(0x61b9a20)') called at /usr/lib/os-autoinst/backend/baseclass.pm line 469
    backend::baseclass::select_console('backend::qemu=HASH(0x60c04b0)', 'HASH(0x61b99a8)') called at /usr/lib/os-autoinst/backend/qemu.pm line 679
    backend::qemu::start_qemu('backend::qemu=HASH(0x60c04b0)') called at /usr/lib/os-autoinst/backend/qemu.pm line 98
    backend::qemu::do_start_vm('backend::qemu=HASH(0x60c04b0)') called at /usr/lib/os-autoinst/backend/baseclass.pm line 255
    backend::baseclass::start_vm('backend::qemu=HASH(0x60c04b0)', undef) called at /usr/lib/os-autoinst/backend/baseclass.pm line 68
    backend::baseclass::handle_command('backend::qemu=HASH(0x60c04b0)', 'HASH(0x6161500)') called at /usr/lib/os-autoinst/backend/baseclass.pm line 427
    backend::baseclass::check_socket('backend::qemu=HASH(0x60c04b0)', 'IO::Handle=GLOB(0x608c5b8)') called at /usr/lib/os-autoinst/backend/qemu.pm line 893
    backend::qemu::check_socket('backend::qemu=HASH(0x60c04b0)', 'IO::Handle=GLOB(0x608c5b8)', 0) called at /usr/lib/os-autoinst/backend/baseclass.pm line 209
    eval {...} called at /usr/lib/os-autoinst/backend/baseclass.pm line 171
    backend::baseclass::run_capture_loop('backend::qemu=HASH(0x60c04b0)', 'IO::Select=ARRAY(0x60095e8)') called at /usr/lib/os-autoinst/backend/baseclass.pm line 120
    backend::baseclass::run('backend::qemu=HASH(0x60c04b0)', 6, 10) called at /usr/lib/os-autoinst/backend/driver.pm line 85
    backend::driver::start('backend::driver=HASH(0x3fbc450)') called at /usr/lib/os-autoinst/backend/driver.pm line 48
    backend::driver::new('backend::driver', 'qemu') called at /usr/bin/isotovideo line 177
    main::init_backend() called at /usr/bin/isotovideo line 236
05:44:49.2515 13306 waitpid for 13314 returned 13314
05:44:49.2519 13306 QEMU: qemu-system-x86_64: -netdev tap,id=qanet0,ifname=tap00,script=/etc/qemu-ifup-br0,downscript=no: could not configure /dev/net/tun (tap00): Device or resource busy
05:44:49.2520 13306 QEMU: qemu-system-x86_64: -netdev tap,id=qanet0,ifname=tap00,script=/etc/qemu-ifup-br0,downscript=no: Device 'tap' could not be initialized
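To spot this failure across many logs, the QEMU message can be matched with a regular expression. The following is only a sketch: the pattern is derived from the single log sample quoted in this ticket, and the function name `find_tap_errors` is made up for illustration, not part of os-autoinst or openQA.

```python
import re

# Pattern based on the one QEMU message quoted in this ticket; this is an
# assumption, not a documented log format.
TAP_ERROR = re.compile(
    r"could not configure /dev/net/tun \((?P<ifname>[^)]+)\): (?P<reason>.+)$"
)


def find_tap_errors(log_text):
    """Return (ifname, reason) pairs for each tap setup failure in the log."""
    hits = []
    for line in log_text.splitlines():
        match = TAP_ERROR.search(line)
        if match:
            hits.append((match.group("ifname"), match.group("reason").strip()))
    return hits


# The two QEMU lines from the log excerpt in this ticket.
sample = (
    "05:44:49.2519 13306 QEMU: qemu-system-x86_64: -netdev tap,id=qanet0,"
    "ifname=tap00,script=/etc/qemu-ifup-br0,downscript=no: could not configure "
    "/dev/net/tun (tap00): Device or resource busy\n"
    "05:44:49.2520 13306 QEMU: qemu-system-x86_64: -netdev tap,id=qanet0,"
    "ifname=tap00,script=/etc/qemu-ifup-br0,downscript=no: Device 'tap' could "
    "not be initialized\n"
)

print(find_tap_errors(sample))
```

Only the first of the two QEMU lines carries the device name and the reason, so the sketch reports one entry for `tap00`.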

reproducible

The job list at https://openqa.suse.de/tests?hoursfresh=24&match=hacluster-supportserver shows that this happens frequently, but not on every run.

problem

H1. problem with setting up the tun device
H1.1. tun device fails for high instance worker numbers (see #14334#note-7)
H2. conflict with other jobs accessing the same devices at the same time

workaround

A restart seems to help:

  1. Check whether another instance of the supportserver is running (and is therefore using the same tun device).
  2. If that instance is in "zombie mode" (its parallel jobs are incomplete, passed, failed, or parallel restarted, so the support server is no longer needed by any job), cancel it.
  3. Restart the parallel jobs (the supportserver will be retriggered).
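On the worker host itself, the first step can be cross-checked by looking for a process that still holds the tap device. The sketch below assumes a Linux kernel that reports an `iff:` line in `/proc/<pid>/fdinfo/<fd>` for open tun/tap descriptors, and it usually needs root to read other users' processes. The function name `tap_holders` and the `proc_root` parameter are hypothetical, introduced only for this example.

```python
import os


def tap_holders(ifname, proc_root="/proc"):
    """Return sorted PIDs that have an open descriptor attached to `ifname`."""
    holders = set()
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []  # no procfs available at this path
    for pid in entries:
        if not pid.isdigit():
            continue  # skip non-process entries such as "self"
        fdinfo_dir = os.path.join(proc_root, pid, "fdinfo")
        try:
            fds = os.listdir(fdinfo_dir)
        except OSError:
            continue  # process exited or permission denied
        for fd in fds:
            try:
                with open(os.path.join(fdinfo_dir, fd)) as handle:
                    for line in handle:
                        key, _, value = line.partition(":")
                        # Assumed fdinfo format for tun/tap fds: "iff:\t<name>"
                        if key == "iff" and value.strip() == ifname:
                            holders.add(int(pid))
            except OSError:
                continue  # descriptor closed while scanning
    return sorted(holders)
```

Calling `tap_holders("tap00")` on the affected worker would list the PIDs (for example a leftover QEMU process) to compare against the supportserver job that should be cancelled. An empty result does not prove the device is free if the scan ran without sufficient permissions.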

Related issues: 2 (0 open, 2 closed)

Related to openQA Project - action #19432: [multimachine][scheduling] Fail of one multi-machine jobs cause restart all of them without checking state of others (Resolved, okurz, 2017-05-30)

Blocks openQA Tests - action #15416: [tools] bridge device seems to have disappeared for HA tests (Resolved, hsehic, 2016-09-16)
