action #174259

Updated by okurz 2 months ago

## Observation 

 See also #166445 

 The inner test ends up incomplete even though its only step, simple_boot, passed. 
 It seems that in some circumstances the process forked by isotovideo exits before the tests_done handling can run and `testfd` is closed, so isotovideo complains about the still-open filehandle. https://github.com/os-autoinst/os-autoinst/blob/master/script/isotovideo#L144 

 openQA test in scenario openqa-Tumbleweed-dev-x86_64-openqa_from_bootstrap@64bit-2G fails in 
 [tests](https://openqa.opensuse.org/tests/4472003#step/tests/7) 

 autoinst-log.txt for the inner tests shows: 
 ``` 
 [2024-09-11T07:00:30.593688-04:00] [debug] [pid:51558] [autotest] process exited: 0 
 … 
 [2024-09-11T07:00:30.705630-04:00] [debug] [pid:51578] backend got TERM 
 [2024-09-11T07:00:30.705933-04:00] [info] [pid:51578] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json 
 [2024-09-11T07:00:31.758537-04:00] [debug] [pid:51578] Passing remaining frames to the video encoder 
 [2024-09-11T07:00:31.794569-04:00] [debug] [pid:51578] Waiting for video encoder to finalize the video 
 [2024-09-11T07:00:31.794634-04:00] [debug] [pid:51578] The built-in video encoder (pid 51580) terminated 
 [2024-09-11T07:00:31.794967-04:00] [debug] [pid:51578] QEMU: qemu-system-x86_64: terminating on signal 15 from pid 51578 (/usr/bin/isotovideo: backend) 
 [2024-09-11T07:00:31.795581-04:00] [debug] [pid:51578] sending magic and exit 
 [2024-09-11T07:00:31.908863-04:00] [debug] [pid:51558] done with backend process 
 51558: EXIT 1 
 [2024-09-11T07:00:31.927014-04:00] [info] Isotovideo exit status: 1 
 [2024-09-11T07:00:31.965003-04:00] [info] +++ worker notes +++ 
 [2024-09-11T07:00:31.965099-04:00] [info] End time: 2024-09-11 11:00:31 
 [2024-09-11T07:00:31.965147-04:00] [info] Result: died 
 [2024-09-11T07:00:31.974998-04:00] [info] Uploading video.ogv 
 [2024-09-11T07:00:32.010558-04:00] [info] Uploading autoinst-log.txt 
 ``` 

 The worker log also shows: 

 ``` 
 [2024-09-11T18:03:44.887585Z] [error] REST-API error (POST https://openqa.opensuse.org/api/v1/jobs/4472003/status): Connection error: Premature connection close (remaining tries: 59) 
 ``` 
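
 The autoinst-log.txt excerpt above shows the autotest process exiting with 0 while isotovideo exits with 1 and the worker records `died`. A minimal sketch for pulling these three values out of a downloaded log, assuming the log line format shown above; `summarize_log` is a hypothetical helper name, and treating this combination as the signature of the issue is an assumption based only on this one excerpt:

```shell
# Hypothetical helper: print the autotest exit code, the isotovideo exit
# status and the worker result found in an autoinst-log.txt.
# Assumes the line formats seen in the excerpt above.
summarize_log() {
    sed -n \
        -e 's/.*process exited: \([0-9][0-9]*\).*/autotest_exit=\1/p' \
        -e 's/.*Isotovideo exit status: \([0-9][0-9]*\).*/exit_status=\1/p' \
        -e 's/.*Result: \([^ ]*\).*/result=\1/p' \
        "$1"
}
```

 Usage: `summarize_log autoinst-log.txt`. Lines that match none of the three patterns are ignored.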

 ## Steps to reproduce 

 Steps to reproduce with isotovideo: 
 * Download HDD_1 from the latest openqa_install+publish test from https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=openqa&flavor=dev&machine=64bit-2G&test=openqa_from_bootstrap&version=Tumbleweed&status=done#next_previous 
 * Download vars.json 
 * Run isotovideo until you hit the issue (~1/100 times) 

 For example for an older build that would look like this: 

 ``` 
 wget https://openqa.opensuse.org/tests/4699317/asset/hdd/opensuse-Tumbleweed-x86_64-20241210-minimalx@64bit.qcow2 
 wget https://openqa.opensuse.org/tests/4699317/file/vars.json 
 for i in $(seq 1 20); do echo RUN $i; isotovideo -e CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-openQA NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-needles-openQA PRODUCTDIR=. 2>log || break; done 
 ``` 
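
 The loop above overwrites `log` on every run. A possible variant that keeps one log per run and reports which run failed could look like the sketch below; `repeat_until_fail` and the `log.<run>` file names are hypothetical and not part of os-autoinst:

```shell
# Hypothetical helper: run a command up to max_runs times and stop at the
# first failure. stderr of each run is kept in its own file log.<run>.
repeat_until_fail() {
    max_runs=$1
    shift
    run=1
    while [ "$run" -le "$max_runs" ]; do
        echo "RUN $run"
        if ! "$@" 2>"log.$run"; then
            echo "FAILED in run $run, see log.$run"
            return 1
        fi
        run=$((run + 1))
    done
    echo "no failure in $max_runs runs"
}
```

 Usage with the command from above: `repeat_until_fail 300 isotovideo -e CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-openQA NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-needles-openQA PRODUCTDIR=.`. Assuming independent runs and the ~1/100 rate mentioned above, 20 runs would hit the issue only about 18% of the time (1 - 0.99^20), while 300 runs would hit it about 95% of the time.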

 
 Also see #174259-7 https://progress.opensuse.org/issues/174259#note-7 

 ## Acceptance Criteria 
 * **AC1**: The scenario openqa_from_bootstrap is consistently stable, with a fail ratio < 0.1% and the inner openQA job not ending up incomplete 
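
 A rough way to check the threshold from AC1 against observed counts, as a sketch only; `fail_ratio_below_threshold` is a hypothetical helper, and how the failed and total job counts are collected is left open here:

```shell
# Hypothetical helper: succeed (exit 0) if failed/total is below 0.1%.
# Arguments: number of failed jobs, total number of jobs.
# Fails if total is 0, since no ratio can be computed then.
fail_ratio_below_threshold() {
    awk -v failed="$1" -v total="$2" \
        'BEGIN { exit !(total > 0 && failed / total < 0.001) }'
}
```

 For example `fail_ratio_below_threshold 1 2000` succeeds and `fail_ratio_below_threshold 1 100` fails. Note that a ratio below 0.1% can only be observed with clearly more than 1000 jobs.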

 ## Suggestions 
 * Run the inner example tests in a VM: *DONE*, did not reproduce the issue 
 * Try to run the openqa-in-openqa test, which needs nested virtualization or emulation 
 * Change the test code to schedule the inner test multiple times instead of once, then run openqa-in-openqa many times and see whether it makes a difference. Is the issue reproducible more often? Does it only fail for the first inner test? 
 * Run the openqa-in-openqa test, pause it, and schedule the inner test manually multiple times 
 * Look for "isotovideo failed" (https://github.com/os-autoinst/os-autoinst/blob/master/script/isotovideo#L151) in the uploaded logs of the inner openQA job to detect the problem
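
 The last suggestion could be scripted roughly as follows, assuming the logs of the inner openQA jobs have already been downloaded into a local directory as files named autoinst-log.txt; `find_isotovideo_failures` is a hypothetical helper name and the directory layout is an assumption:

```shell
# Hypothetical helper: list every autoinst-log.txt below the given directory
# that contains the literal message "isotovideo failed".
find_isotovideo_failures() {
    find "$1" -type f -name 'autoinst-log.txt' \
        -exec grep -lF 'isotovideo failed' {} +
}
```

 Usage: `find_isotovideo_failures ./inner-job-logs` prints one path per affected job and prints nothing if no log contains the message.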
