action #58379
closed
isotovideo is slow to shutdown / error messages on proper shutdown
Added by okurz about 5 years ago.
Updated over 4 years ago.
Category:
Feature requests
Description
Observation¶
The isotovideo shutdown is far from speedy, e.g. see
[2019-10-17T23:10:23.857 CEST] [debug] no match: 79.0s, best candidate: grub2-2-20131030 (0.00)
^C[2019-10-17T23:10:24.058 CEST] [debug] signalhandler got INT
can_read received kill signal at ../../myjsonrpc.pm line 91.
[2019-10-17T23:10:24.059 CEST] [debug] killing backend process 10637
[2019-10-17T23:10:24.549 CEST] [debug] WARNING: check_asserted_screen took 0.68 seconds for 32 candidate needles - make your needles more specific
[2019-10-17T23:10:24.549 CEST] [debug] no match: 77.8s, best candidate: grub2-2-20131030 (0.00)
[2019-10-17T23:10:24.575 CEST] [debug] QEMU: qemu-system-x86_64: terminating on signal 2
[2019-10-17T23:10:24.576 CEST] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
[2019-10-17T23:10:24.581 CEST] [debug] sending magic and exit
Unexpected end of data 0
[2019-10-17T23:10:26.062 CEST] [debug] done with backend process
[2019-10-17T23:10:26.062 CEST] [debug] killing command server 10627 because test execution ended through exception
[2019-10-17T23:10:28.063 CEST] [debug] done with command server
[2019-10-17T23:10:28.063 CEST] [debug] killing autotest process 10630
[2019-10-17T23:10:30.063 CEST] [debug] done with autotest process
10623: EXIT 0
This is from a manual isotovideo run.
6s just to kill some processes. I think we can still improve this. Also the "Unexpected end of data 0" isn't that nice. The "end of data" should very much be expected when the processes are asked to shutdown :)
- Copied from action #57707: isotovideo fails to terminate cleanly, message "isotovideo: unable to inform websocket clients about stopping command server: Request timeout", regression from 4cd4af2b added
- Priority changed from Normal to Low
- Status changed from New to In Progress
- Assignee set to okurz
- Related to action #53999: openqa-worker (isotovideo) dumps core / segfaults quite often on several workers and distributions for cancelled jobs added
- Target version set to Current Sprint
- Status changed from In Progress to Workable
- Assignee deleted (
okurz)
- Target version deleted (
Current Sprint)
- Status changed from Workable to Feedback
- Assignee set to okurz
- Due date set to 2020-04-14
- Status changed from Feedback to Blocked
- Status changed from Blocked to Resolved
PR was merged and a new release 0.25 was published which is by now also published as packages within openSUSE:Factory as well as our derived branch projects in devel:openQA. Also deployed by now on osd and o3 workers. No problems reported. Taking any example job, e.g. https://openqa.suse.de/tests/4125075/file/autoinst-log.txt I can find:
[2020-04-15T14:54:13.264 CEST] [debug] EVENT {"event":"STOP","timestamp":{"microseconds":657695,"seconds":1586955252}}
[2020-04-15T14:54:13.267 CEST] [debug] ||| finished shutdown_ltp kernel at 2020-04-15 12:54:13 (35 s)
[2020-04-15T14:54:13.269 CEST] [debug] stopping autotest process 18989
[2020-04-15T14:54:13.276 CEST] [debug] [autotest] process exited: 0
[2020-04-15T14:54:13.276 CEST] [debug] done with autotest process
[2020-04-15T14:54:13.276 CEST] [debug] terminating command server 18898 because test execution ended
[2020-04-15T14:54:13.276 CEST] [debug] isotovideo: informing websocket clients before stopping command server: http://127.0.0.1:20283/ttVvf5JAFkA9USqT/broadcast
[2020-04-15T14:54:13.286 CEST] [debug] commands process exited: 0
[2020-04-15T14:54:13.287 CEST] [debug] done with command server
[2020-04-15T14:54:13.287 CEST] [debug] isotovideo done
[2020-04-15T14:54:13.287 CEST] [debug] BACKEND SHUTDOWN 1
[2020-04-15T14:54:13.288 CEST] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
[2020-04-15T14:54:14.317 CEST] [debug] flushing frames
[2020-04-15T14:54:14.352 CEST] [debug] QEMU: qemu-system-x86_64: terminating on signal 15 from pid 19005 (/usr/bin/isotovideo: backen)
[2020-04-15T14:54:14.353 CEST] [debug] sending magic and exit
[2020-04-15T14:54:14.353 CEST] [debug] received magic close
[2020-04-15T14:54:14.375 CEST] [debug] backend process exited: 0
[2020-04-15T14:54:14.376 CEST] [debug] stopping backend process 19005
[2020-04-15T14:54:14.376 CEST] [debug] done with backend process
18895: EXIT 0
[2020-04-15T14:54:14.0451 CEST] [info] [pid:25970] Isotovideo exit status: 0
That is a shutdown within 800ms compared to previously about 6s \o/
- Related to coordination #65118: [epic] multimachine test fails with symptoms "websocket refusing connection" and other unclear reasons added
Also available in: Atom
PDF