action #174259
opencoordination #127031: [saga][epic] openQA for SUSE customers
coordination #165393: [epic] Improved code coverage in openQA
[sporadic] isotovideo fails complaining about still existing testfd filehandle size:M
Description
Observation¶
See also #166445
The inner test is incomplete despite the only step simple_boot is passed.
It seems in some circumstances the process forked by isotovideo exits before the tests_done handling can be worked on and testdf
closed, so isotovideo complains about the still open filehandle. https://github.com/os-autoinst/os-autoinst/blob/master/script/isotovideo#L144
openQA test in scenario openqa-Tumbleweed-dev-x86_64-openqa_from_bootstrap@64bit-2G fails in
tests
autoinst-log.txt for the inner tests shows:
[2024-09-11T07:00:30.593688-04:00] [debug] [pid:51558] [autotest] process exited: 0
…
[2024-09-11T07:00:30.705630-04:00] [debug] [pid:51578] backend got TERM
[2024-09-11T07:00:30.705933-04:00] [info] [pid:51578] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
[2024-09-11T07:00:31.758537-04:00] [debug] [pid:51578] Passing remaining frames to the video encoder
[2024-09-11T07:00:31.794569-04:00] [debug] [pid:51578] Waiting for video encoder to finalize the video
[2024-09-11T07:00:31.794634-04:00] [debug] [pid:51578] The built-in video encoder (pid 51580) terminated
[2024-09-11T07:00:31.794967-04:00] [debug] [pid:51578] QEMU: qemu-system-x86_64: terminating on signal 15 from pid 51578 (/usr/bin/isotovideo: backend)
[2024-09-11T07:00:31.795581-04:00] [debug] [pid:51578] sending magic and exit
[2024-09-11T07:00:31.908863-04:00] [debug] [pid:51558] done with backend process
51558: EXIT 1
[2024-09-11T07:00:31.927014-04:00] [info] Isotovideo exit status: 1
[2024-09-11T07:00:31.965003-04:00] [info] +++ worker notes +++
[2024-09-11T07:00:31.965099-04:00] [info] End time: 2024-09-11 11:00:31
[2024-09-11T07:00:31.965147-04:00] [info] Result: died
[2024-09-11T07:00:31.974998-04:00] [info] Uploading video.ogv
[2024-09-11T07:00:32.010558-04:00] [info] Uploading autoinst-log.txt
also from the worker
[2024-09-11T18:03:44.887585Z] [error] REST-API error (POST https://openqa.opensuse.org/api/v1/jobs/4472003/status): Connection error: Premature connection close (remaining tries: 59)
Steps to reproduce¶
Steps to reproduce with isotovideo
Download HDD_1 from openqa_install+publish test
Download vars.json
Run isotovideo until you hit the issue (~1/100 times)
wget https://openqa.opensuse.org/tests/4699317/asset/hdd/opensuse-Tumbleweed-x86_64-20241210-minimalx@64bit.qcow2
wget https://openqa.opensuse.org/tests/4699317/file/vars.json
for i in $(seq 1 20); do echo RUN $i; isotovideo -e CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-openQA NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-needles-openQA PRODUCTDIR=. 2>log || break; done
Also see https://progress.opensuse.org/issues/174259#note-7
Acceptance Criteria¶
- AC1: the scenario openqa_from_bootstrap is consistently stable with fail ratio < 0.1% with inner openQA job not incomplete
Suggestions¶
- Run the inner example tests in a VM DONE didn't reproduce it
- Try to run the openqa-in-openqa test, needing nested virt or emulating
- Change the test code to schedule the inner test multiple times instead of once and run openqa-in-openqa many times and see if it makes a difference. Is it more often reproducible? Does it only fail for the first inner test?
- Run openqa-in-openqa test and pause the test and schedule the inner test manually multiple times
- Look for https://github.com/os-autoinst/os-autoinst/blob/master/script/isotovideo#L151 "isotovideo failed' from uploaded logs from the inner openQA job to detect the problem
Updated by tinita 7 days ago
- Copied from action #166445: [openQA-in-openQA][sporadic] test fails in tests, simple_boot incomplete auto_review:"no candidate needle.*openqa-test-details.*matched":retry size:S added
Updated by tinita 7 days ago
- Subject changed from [openQA-in-openQA][sporadic] test fails in tests, simple_boot incomplete auto_review:"no candidate needle.*openqa-test-details.*matched":retry to [sporadic] test fails in tests, simple_boot incomplete auto_review:"no candidate needle.*openqa-test-details.*matched":retry
Updated by jbaier_cz 6 days ago
- Description updated (diff)
I added steps to reproduce inside a distrobox (just a plain opensuse/distrobox:latest with os-autoinst-devel and openQA-devel installed).
RUN 12
Web application available at http://[::]:20183
GOT GO
3292083: EXIT 101
For some reason, I was not able to reproduce with podman and isotovideo:qemu-x86 container image.
Updated by openqa_review 5 days ago
- Due date set to 2024-12-28
Setting due date based on mean cycle time of SUSE QE Tools
Updated by tinita 2 days ago
So, running the inner example tests in my VM didn't reproduce it.
I now tried to run the openqa-in-openqa test, but nested virtualization is not enabled in the VM. I asked for help in the openplatform channel.
Additional ideas:
- Change the test code to schedule the inner test multiple times instead of once and run openqa-in-openqa many times and see if it makes a difference. Is it more often reproducible? Does it only fail for the first inner test?
- Run openqa-in-openqa test and pause the test and schedule the inner test manually multiple times
Updated by tinita 1 day ago
- Status changed from In Progress to Workable
I wanted to run openqa-in-openqa tests in that VM, but wasn't able to get nested virtualization working. I asked in the openplatform channel and got some help, but still something seems to be missing.
Setting back to workable due to absence.
Updated by livdywan about 1 hour ago
- Subject changed from [sporadic] isotovideo fails complaining about still existing testfd filehandle to [sporadic] isotovideo fails complaining about still existing testfd filehandle size:M
- Description updated (diff)