action #117352
closed
OBS build fails in t/29-backend-generalhw.t size:M
Description
Observation
Package devel:openQA/os-autoinst failed to build in openSUSE_Leap_15.4/aarch64
Check out the package for editing:
osc checkout devel:openQA os-autoinst
Last lines of build log:
[ 556s] 3: [20:58:54] ./t/29-backend-generalhw.t .................
[ 556s] 3: ok 1 - can check socket
[ 556s] 3: # Subtest: start VM
[ 556s] 3: ok 1 - return value
[ 556s] 3: ok 2 - poweroff/on commands invoked
[ 556s] 3: ok 3 - tried to connect to VNC server
[ 556s] 3: 1..3
[ 556s] 3: ok 2 - start VM
[ 556s] 3: # Subtest: start VM with video
[ 556s] 3: ok 1 - return value
[ 556s] 3: ok 2 - poweroff/on commands invoked
[ 556s] 3: ok 3 - tried to connect to video stream
[ 556s] 3: 1..3
[ 556s] 3: ok 3 - start VM with video
[ 556s] 3: # Subtest: hdd args
[ 556s] 3: ok 1 - return value
[ 556s] 3: 1..1
[ 556s] 3: ok 4 - hdd args
[ 556s] 3: # Subtest: stop VM
[ 556s] 3: ok 1 - return value
[ 556s] 3: ok 2 - poweroff/on commands invoked
[ 556s] 3: 1..2
[ 556s] 3: ok 5 - stop VM
[ 556s] 3: # Subtest: error handling
[ 556s] 3: ok 1 - IPC error thrown with context
[ 556s] 3: ok 2 - error when GENERAL_HW_CMD_DIR is not a directory
[ 556s] 3: ok 3 - WORKER_HOSTNAME required
[ 556s] 3: 1..3
[ 556s] 3: ok 6 - error handling
[ 556s] 3: # Subtest: handling power commands
[ 556s] 3: ok 1 - power commands invoked
[ 556s] 3: ok 2 - dies on invalid action
[ 556s] 3: 1..2
[ 556s] 3: ok 7 - handling power commands
[ 556s] 3: # Subtest: re-login VNC
[ 556s] 3: ok 1 - re-login has truthy return code
[ 556s] 3: ok 2 - VNC base console assigned
[ 556s] 3: ok 3 - previously assigned VNC socket closed
[ 556s] 3: 1..3
[ 556s] 3: ok 8 - re-login VNC
[ 556s] 3: # Subtest: serial grab
[ 556s] 3: # Subtest: capturing output
[ 556s] 3: ok 1 - serial PID assigned: 4324
[ 556s] 3: ok 2 - serial output captured
[ 556s] 3: 1..2
[ 556s] 3: ok 1 - capturing output
[ 556s] 3: # Subtest: stop grabbing
[ 556s] 3: 1..0
[ 556s] 3: not ok 2 - No tests run for subtest "stop grabbing"
[ 556s] 3: # No tests run!
[ 556s] 3:
[ 556s] 3: # Failed test 'No tests run for subtest "stop grabbing"'
[ 556s] 3: # at ./t/29-backend-generalhw.t line 174.
[ 556s] 3: 1..2
[ 556s] 3: not ok 9 - serial grab
[ 556s] 3: ok 10 - no (unexpected) warnings (via END block)
[ 556s] 3: # Looks like you failed 1 test of 2.
[ 556s] 3:
[ 556s] 3: # Failed test 'serial grab'
[ 556s] 3: # at ./t/29-backend-generalhw.t line 177.
[ 556s] 3: Can't kill('-TERM', '4325'): No such process at /home/abuild/rpmbuild/BUILD/os-autoinst-4.6.1664377866.1f6d57e/backend/generalhw.pm line 186
[ 556s] 3: # Tests were run but no plan was declared and done_testing() was not seen.
[ 556s] 3: # Looks like your test exited with 255 just after 10.
[ 556s] 3: Dubious, test returned 255 (wstat 65280, 0xff00)
[ 556s] 3: Failed 1/10 subtests
...
[ 608s] 3:
[ 608s] 3: Test Summary Report
[ 608s] 3: -------------------
[ 608s] 3: ./t/29-backend-generalhw.t (Wstat: 65280 Tests: 10 Failed: 1)
[ 608s] 3: Failed test: 9
[ 608s] 3: Non-zero exit status: 255
[ 608s] 3: Parse errors: No plan found in TAP output
[ 608s] 3: Files=59, Tests=1266, 373 wallclock secs ( 1.73 usr 0.82 sys + 319.81 cusr 51.16 csys = 373.52 CPU)
[ 608s] 3: Result: FAIL
[29409s] qemu-system-aarch64: terminating on signal 15 from pid 27497 ()
That's not the first occurrence: https://progress.opensuse.org/issues/111254#note-14
Suggestions
- Disable the subtest in OBS
- Try and reproduce this locally
- ~Confirm that this is arm-specific (we suspect it's not)~
- Double-check behavior of spawning and killing a process e.g. from this unit test, maybe it doesn't "sleep" properly
Updated by livdywan about 2 years ago
- Subject changed from OBS ARM build fails in t/29-backend-generalhw.t to OBS build fails in t/29-backend-generalhw.t size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by mkittler almost 2 years ago
- Status changed from Workable to In Progress
Let's exclude the test for now: https://github.com/os-autoinst/os-autoinst/pull/2189
Updated by openqa_review almost 2 years ago
- Due date set to 2022-10-20
Setting due date based on mean cycle time of SUSE QE Tools
Updated by mkittler almost 2 years ago
- Status changed from In Progress to Feedback
I'm not sure how to fix this test as I fail to see the problem. Maybe someone else from the team can have a look at the test (and the code being tested)? It is actually not really complicated.
Note that I tested locally what happens when the `sleep` is missing (so the command terminates immediately). In that case one does not actually run into the error shown in the ticket description, because the process remains around as a zombie until it has been waited for. That is easily observable by putting a sleep (or rather a long loop, because sleep is mocked in that test) before the stop function. So I suppose the cause cannot be that `sleep` isn't invoked correctly.
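The zombie behavior described above can be reproduced outside the test suite. The following is a minimal sketch in Python (not the Perl code from os-autoinst), assuming a Linux system; the function name `signal_child_before_and_after_wait` is made up for this illustration. It forks a child that exits immediately, signals it while it is still an unreaped zombie, and signals it again after it has been waited for.

```python
import errno
import os
import signal


def signal_child_before_and_after_wait():
    """Fork a child that exits at once; signal it before and after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: terminate immediately, like a command without sleep

    # Block until the child has exited, but leave it unreaped (a zombie).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)

    # The zombie still occupies its PID, so sending a signal succeeds.
    os.kill(pid, signal.SIGTERM)
    zombie_ok = True

    os.waitpid(pid, 0)  # reap the child; its PID is released

    try:
        os.kill(pid, signal.SIGTERM)
        reaped_errno = None
    except OSError as e:
        reaped_errno = e.errno  # expected: ESRCH ("No such process")
    return zombie_ok, reaped_errno
```

Under these assumptions the first signal is accepted and only the second one fails with `ESRCH`, which matches the observation that an early exit of the child alone does not produce "No such process" as long as nobody has waited for it yet.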
Since `generalhw.pm` has `use autodie ':all';`, the lack of explicit error handling in `start_serial_grab` and `stop_serial_grab` should actually be ok. So if forking the subprocess had not worked, we should have seen an error message and not a plausible PID for the forked process being logged. In fact, that error handling is what causes the test to fail.
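One detail of the failing call in the log may be worth double-checking: it is `kill('-TERM', ...)`, and in Perl a negative signal name addresses a process group rather than a single process. The sketch below, again in Python and assuming a Linux system, shows that such a group signal can fail with "No such process" even though the child itself is alive, namely when no process group with the child's PID exists. Whether this is what happens in the OBS build is not established by this ticket; the function name `group_kill_without_own_group` is made up for this illustration.

```python
import errno
import os
import signal


def group_kill_without_own_group():
    """Signal the group named after a live child that never became group leader."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(write_end)
        os.read(read_end, 1)  # child: block until the parent closes the pipe
        os._exit(0)
    os.close(read_end)

    try:
        # Comparable to Perl's kill('-TERM', $pid): targets process group `pid`.
        os.killpg(pid, signal.SIGTERM)
        group_errno = None
    except OSError as e:
        group_errno = e.errno  # expected: ESRCH, the child is in our group

    os.kill(pid, 0)  # signal 0 only checks existence; the child is still alive
    alive = True

    os.close(write_end)  # child reads EOF and exits
    os.waitpid(pid, 0)
    return group_errno, alive
```

Python raises an exception on the failed call by default, which is comparable to what `autodie` does for `kill` in Perl: the error is not silently ignored but aborts the caller unless it is handled.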
Updated by mkittler almost 2 years ago
- Status changed from Feedback to Resolved
It doesn't look like anybody else wants to have a look, and I'm not sure what the problem is. So I'm resolving the issue by just keeping the test disabled in OBS. (I haven't seen any failures in GitHub Actions.)