action #59858
closed
"Migrate to file failed, it has been running for more than 240 at /usr/lib/os-autoinst/backend/qemu.pm line 260." broken NVMe on openqaworker13, jobs incomplete trying to save snapshots
Start date: 2019-11-14
Due date:
% Done: 0%
Estimated time:
Description
Observation
https://openqa.suse.de/tests/3594947/file/autoinst-log.txt on openqaworker13 shows:
[2019-11-14T20:32:28.969 CET] [debug] EVENT {"data":{"status":"active"},"event":"MIGRATION","timestamp":{"microseconds":476972,"seconds":1573759948}}
[2019-11-14T20:32:28.969 CET] [debug] Migrating total bytes: 1078796288
…
[2019-11-14T20:36:28.601 CET] [debug] Migrating total bytes: 1078796288
[2019-11-14T20:36:28.601 CET] [debug] Migrating remaining bytes: 337879040
[2019-11-14T20:36:28.602 CET] [debug] EVENT {"data":{"status":"cancelling"},"event":"MIGRATION","timestamp":{"microseconds":602550,"seconds":1573760188}}
[2019-11-14T20:36:28.603 CET] [debug] Backend process died, backend errors are reported below in the following lines:
Migrate to file failed, it has been running for more than 240 at /usr/lib/os-autoinst/backend/qemu.pm line 260.
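To put a number on how slow the snapshot write was, the timestamps and byte counters in the excerpt above can be turned into an approximate throughput. The following is only a rough sketch: the function name `migration_throughput` is hypothetical, the line format is taken from the excerpt, and "total minus remaining" is assumed to approximate the bytes actually written, which ignores pages that were re-sent during migration.

```python
import re
from datetime import datetime

# Matches the "Migrating total/remaining bytes" debug lines as they appear
# in the autoinst-log.txt excerpt (format assumed from that excerpt only).
LINE_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+) \w+\] \[debug\] "
    r"Migrating (?P<kind>total|remaining) bytes:\s+(?P<bytes>\d+)"
)


def migration_throughput(log_text):
    """Return approximate bytes/second written, or None if data is missing."""
    samples = []
    for m in LINE_RE.finditer(log_text):
        ts = datetime.strptime(m["ts"], "%Y-%m-%dT%H:%M:%S.%f")
        samples.append((ts, m["kind"], int(m["bytes"])))

    totals = [s for s in samples if s[1] == "total"]
    remaining = [s for s in samples if s[1] == "remaining"]
    if not totals or not remaining:
        return None

    start, _, total_bytes = totals[0]      # first progress report
    end, _, bytes_left = remaining[-1]     # last progress report
    elapsed = (end - start).total_seconds()
    if elapsed <= 0:
        return None

    # Assumption: total - remaining approximates the bytes already written.
    return (total_bytes - bytes_left) / elapsed


SAMPLE = """\
[2019-11-14T20:32:28.969 CET] [debug] Migrating total bytes: 1078796288
[2019-11-14T20:36:28.601 CET] [debug] Migrating total bytes: 1078796288
[2019-11-14T20:36:28.601 CET] [debug] Migrating remaining bytes: 337879040
"""

rate = migration_throughput(SAMPLE)
print(f"{rate / 1e6:.2f} MB/s")
```

For the lines quoted in this ticket this works out to roughly 3 MB/s, far below what an NVMe device should sustain, which fits the 240 s timeout being hit with about a third of the data still unwritten.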
From dmesg on openqaworker13 I can see:
[Thu Nov 14 21:03:15 2019] nvme nvme0: Abort status: 0x0
…
[Thu Nov 14 21:03:15 2019] nvme nvme0: Abort status: 0x0
[Thu Nov 14 21:03:24 2019] nvme nvme0: I/O 451 QID 3 timeout, aborting
…
[Thu Nov 14 21:03:24 2019] nvme nvme0: I/O 458 QID 3 timeout, aborting
[Thu Nov 14 21:03:28 2019] nvme nvme0: Abort status: 0x0
…
[Thu Nov 14 21:03:28 2019] nvme nvme0: Abort status: 0x0
So it looks like the NVMe device is in use but is either faulty or not delivering the expected performance.
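To check whether the timeouts are isolated or recurring, the dmesg output can be summarized per device and queue. This is a minimal sketch: the helper name `count_nvme_timeouts` is hypothetical and the regular expression is based only on the message format shown in the excerpt above, which may differ between kernel versions.

```python
import re
from collections import Counter

# Message format assumed from the dmesg excerpt in this ticket.
TIMEOUT_RE = re.compile(
    r"nvme (?P<dev>nvme\d+): I/O (?P<tag>\d+) QID (?P<qid>\d+) timeout, aborting"
)


def count_nvme_timeouts(dmesg_text):
    """Count 'timeout, aborting' messages per (device, queue id)."""
    counts = Counter()
    for m in TIMEOUT_RE.finditer(dmesg_text):
        counts[(m["dev"], int(m["qid"]))] += 1
    return counts


DMESG_SAMPLE = """\
[Thu Nov 14 21:03:15 2019] nvme nvme0: Abort status: 0x0
[Thu Nov 14 21:03:24 2019] nvme nvme0: I/O 451 QID 3 timeout, aborting
[Thu Nov 14 21:03:24 2019] nvme nvme0: I/O 458 QID 3 timeout, aborting
[Thu Nov 14 21:03:28 2019] nvme nvme0: Abort status: 0x0
"""

for (dev, qid), n in count_nvme_timeouts(DMESG_SAMPLE).items():
    print(f"{dev} queue {qid}: {n} I/O timeouts")
```

Feeding the full `dmesg` output of the worker into the function would show how many timeouts accumulated and whether they are confined to one queue.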