openQA workers on ip-172-25-5-39 fail without any obvious clue about the reason for the failure. This host is an aarch64 AWS A1 instance with 3 workers running on SLE15-SP1.
One occurrence of the failure: https://openqa.opensuse.org/tests/1660849
GOT GO
[2021-03-08T10:18:37.703 UTC] [debug] THERE IS NOTHING TO READ 4 5 4
myjsonrpc: remote end terminated connection, stopping at /usr/lib/os-autoinst/myjsonrpc.pm line 57, <$fh> line 78.
[2021-03-08T10:18:37.703 UTC] [debug] stopping backend process 2994
[2021-03-08T10:18:37.705 UTC] [debug] backend got TERM
[2021-03-08T10:18:37.705 UTC] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
[2021-03-08T10:18:38.714 UTC] [debug] flushing frames
[2021-03-08T10:18:38.836 UTC] [debug] QEMU: QEMU emulator version 188.8.131.52 (SUSE Linux Enterprise 15)
[2021-03-08T10:18:38.836 UTC] [debug] QEMU: Copyright (c) 2003-2018 Fabrice Bellard and the QEMU Project developers
[2021-03-08T10:18:38.836 UTC] [debug] QEMU: qemu-system-aarch64: terminating on signal 15 from pid 2994 (/usr/bin/isotovideo: backen)
[2021-03-08T10:18:38.838 UTC] [debug] sending magic and exit
[2021-03-08T10:18:39.005 UTC] [debug] done with backend process
[2021-03-08T10:18:39.005 UTC] [debug] stopping autotest process 2978
[2021-03-08T10:18:39.206 UTC] [debug] done with autotest process
2968: EXIT 0
For now I disabled the workers on this machine to avoid lots of test failures.
- Due date set to 2021-03-16
- Category set to Support
- Status changed from New to Feedback
- Assignee set to okurz
- Target version set to Ready
- Parent task set to #62420
Judging by https://openqa.opensuse.org/admin/workers/270 this looks reproducible, right? Could you try to run a test locally with just isotovideo and a vars.json file?
We also have some other cases in #62420 regarding "we do not understand from logs what is going on".
Please also keep in mind that packages on SLE15-SP1 - if they are even up-to-date - are not properly tested and likely to have some problems.
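A minimal sketch of the local reproduction suggested above, assuming isotovideo is installed on the worker host and reads vars.json from its current working directory. The helper names `prepare_pool_dir` and `run_isotovideo` are hypothetical and not part of os-autoinst. The settings passed in should ideally be copied from the failing job.

```python
import json
import shutil
import subprocess
from pathlib import Path


def prepare_pool_dir(pool_dir, settings):
    """Create pool_dir if needed, write settings to vars.json, return its path."""
    pool = Path(pool_dir)
    pool.mkdir(parents=True, exist_ok=True)
    vars_file = pool / "vars.json"
    vars_file.write_text(json.dumps(settings, indent=2, sort_keys=True) + "\n")
    return vars_file


def run_isotovideo(pool_dir, executable="isotovideo"):
    """Run isotovideo inside pool_dir without any worker involved.

    Returns the exit code, or None if the executable is not on PATH.
    Assumption: isotovideo picks up vars.json from the working directory.
    """
    if shutil.which(executable) is None:
        return None
    # Output goes straight to the terminal so nothing is filtered or lost.
    return subprocess.run([executable], cwd=pool_dir).returncode
```

For example, `prepare_pool_dir("pool", {"ARCH": "aarch64", "BACKEND": "qemu"})` followed by `run_isotovideo("pool")` would start a run in `./pool`. The two settings shown are placeholders only, and a real run needs the full set of variables from the failing job. Running without the worker in between should make it easier to tell whether the backend itself dies or the worker stops it.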
- Assignee changed from okurz to ggardet_arm
- Target version changed from Ready to future
That's a step forward. I hope you understand that we currently rely heavily on the special environment where this happens, because so far I am unaware of any way to reproduce the same issue elsewhere.
What could be improved on the backend side, though, is the logging and the tracking of subprocesses, so that people can better understand the rather unclear text flow in this section:
GOT GO
[2021-03-08T10:18:37.703 UTC] [debug] THERE IS NOTHING TO READ 4 5 4
myjsonrpc: remote end terminated connection, stopping at /usr/lib/os-autoinst/myjsonrpc.pm line 57, <$fh> line 78.
But again, the main problem I see here is that you are using SLE15-SP1, which is unfortunately unsupported. Could you switch to openSUSE Leap 15.2 for the complete system, or run the worker in a container of either openSUSE Leap 15.2 or openSUSE Tumbleweed? Alternatively, if you are interested in improving the support for SLE15-SP1 yourself, look into the unresolvables on https://build.opensuse.org/project/monitor/devel:openQA?arch_aarch64=1&blocked=1&broken=1&building=1&defaults=0&deleting=1&dispatching=1&failed=1&finished=1&locked=1&repo_SLE_15_SP1=1&scheduled=1&signing=1&succeeded=1&unresolvable=1