action #174259
opencoordination #127031: [saga][epic] openQA for SUSE customers
coordination #165393: [epic] Improved code coverage in openQA
[sporadic] isotovideo fails complaining about still existing testfd filehandle size:M
Description
Observation¶
See also #166445
The inner test is incomplete despite the only step simple_boot is passed.
It seems in some circumstances the process forked by isotovideo exits before the tests_done handling can be worked on and testdf
closed, so isotovideo complains about the still open filehandle. https://github.com/os-autoinst/os-autoinst/blob/master/script/isotovideo#L144
openQA test in scenario openqa-Tumbleweed-dev-x86_64-openqa_from_bootstrap@64bit-2G fails in
tests
autoinst-log.txt for the inner tests shows:
[2024-09-11T07:00:30.593688-04:00] [debug] [pid:51558] [autotest] process exited: 0
…
[2024-09-11T07:00:30.705630-04:00] [debug] [pid:51578] backend got TERM
[2024-09-11T07:00:30.705933-04:00] [info] [pid:51578] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
[2024-09-11T07:00:31.758537-04:00] [debug] [pid:51578] Passing remaining frames to the video encoder
[2024-09-11T07:00:31.794569-04:00] [debug] [pid:51578] Waiting for video encoder to finalize the video
[2024-09-11T07:00:31.794634-04:00] [debug] [pid:51578] The built-in video encoder (pid 51580) terminated
[2024-09-11T07:00:31.794967-04:00] [debug] [pid:51578] QEMU: qemu-system-x86_64: terminating on signal 15 from pid 51578 (/usr/bin/isotovideo: backend)
[2024-09-11T07:00:31.795581-04:00] [debug] [pid:51578] sending magic and exit
[2024-09-11T07:00:31.908863-04:00] [debug] [pid:51558] done with backend process
51558: EXIT 1
[2024-09-11T07:00:31.927014-04:00] [info] Isotovideo exit status: 1
[2024-09-11T07:00:31.965003-04:00] [info] +++ worker notes +++
[2024-09-11T07:00:31.965099-04:00] [info] End time: 2024-09-11 11:00:31
[2024-09-11T07:00:31.965147-04:00] [info] Result: died
[2024-09-11T07:00:31.974998-04:00] [info] Uploading video.ogv
[2024-09-11T07:00:32.010558-04:00] [info] Uploading autoinst-log.txt
also from the worker
[2024-09-11T18:03:44.887585Z] [error] REST-API error (POST https://openqa.opensuse.org/api/v1/jobs/4472003/status): Connection error: Premature connection close (remaining tries: 59)
Steps to reproduce¶
Steps to reproduce with isotovideo
Download HDD_1 from openqa_install+publish test
Download vars.json
Run isotovideo until you hit the issue (~1/100 times)
wget https://openqa.opensuse.org/tests/4699317/asset/hdd/opensuse-Tumbleweed-x86_64-20241210-minimalx@64bit.qcow2
wget https://openqa.opensuse.org/tests/4699317/file/vars.json
for i in $(seq 1 20); do echo RUN $i; isotovideo -e CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-openQA NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-needles-openQA PRODUCTDIR=. 2>log || break; done
Also see https://progress.opensuse.org/issues/174259#note-7
Acceptance Criteria¶
- AC1: the scenario openqa_from_bootstrap is consistently stable with fail ratio < 0.1% with inner openQA job not incomplete
Suggestions¶
- Run the inner example tests in a VM DONE didn't reproduce it
- Try to run the openqa-in-openqa test, needing nested virt or emulating
- Change the test code to schedule the inner test multiple times instead of once and run openqa-in-openqa many times and see if it makes a difference. Is it more often reproducible? Does it only fail for the first inner test?
- Run openqa-in-openqa test and pause the test and schedule the inner test manually multiple times
- Look for https://github.com/os-autoinst/os-autoinst/blob/master/script/isotovideo#L151 "isotovideo failed' from uploaded logs from the inner openQA job to detect the problem