Actions
action #73288
opencoordination #102906: [saga][epic] Increased stability of tests with less "known failures", known incompletes handled automatically within openQA
coordination #102909: [epic] Prevent more incompletes already within os-autoinst or openQA
auto_review:"setup failure: Cache service status error from API: Minion job.*Job terminated unexpectedly":retry
Status:
Workable
Priority:
Low
Assignee:
-
Category:
Feature requests
Target version:
QA (public, currently private due to #173521) - future
Start date:
2020-10-13
Due date:
% Done:
0%
Estimated time:
Description
Observation¶
from https://openqa.suse.de/tests/4812998
[2020-10-13T10:52:32.0254 CEST] [info] [pid:36934] Download of sle-15-SP1-x86_64-4.12.14-1063.1.ga210de1-Server-DVD-Incidents-Kernel-KOTD@64bit-with-ltp.qcow2 processed:
[info] [#43340] Cache size of "/var/lib/openqa/cache" is 50GiB, with limit 50GiB
[info] [#43340] Downloading "sle-15-SP1-x86_64-4.12.14-1063.1.ga210de1-Server-DVD-Incidents-Kernel-KOTD@64bit-with-ltp.qcow2" from "http://openqa.suse.de/tests/4812991/asset/hdd/sle-15-SP1-x86_64-4.12.14-1063.1.ga210de1-Server-DVD-Incidents-Kernel-KOTD@64bit-with-ltp.qcow2"
[info] [#43340] Cache size 50GiB + needed 2GiB exceeds limit of 50GiB, purging least used assets
[info] [#43340] Purging "/var/lib/openqa/cache/openqa.suse.de/sle-15-SP1-x86_64-Build20201013-1-HA-QAM.qcow2" because we need space for new assets, reclaiming 1.1GiB
[info] [#43340] Purging "/var/lib/openqa/cache/openqa.suse.de/SLES-15-SP3-x86_64-Build56.1@64bit-gnome.qcow2" because we need space for new assets, reclaiming 1.6GiB
[info] [#43340] Size of "/var/lib/openqa/cache/openqa.suse.de/sle-15-SP1-x86_64-4.12.14-1063.1.ga210de1-Server-DVD-Incidents-Kernel-KOTD@64bit-with-ltp.qcow2" is 2GiB, with ETag ""81ad0000-5b183f589249e""
[info] [#43340] Download of "/var/lib/openqa/cache/openqa.suse.de/sle-15-SP1-x86_64-4.12.14-1063.1.ga210de1-Server-DVD-Incidents-Kernel-KOTD@64bit-with-ltp.qcow2" successful, new cache size is 49GiB
[2020-10-13T10:52:32.0261 CEST] [debug] [pid:36934] Found ISO, caching SLE-15-SP1-Installer-DVD-x86_64-GM-DVD1.iso
[2020-10-13T10:52:32.0263 CEST] [info] [pid:36934] Downloading SLE-15-SP1-Installer-DVD-x86_64-GM-DVD1.iso, request #43343 sent to Cache Service
[2020-10-13T10:52:37.0314 CEST] [info] [pid:36934] Download of SLE-15-SP1-Installer-DVD-x86_64-GM-DVD1.iso processed:
[info] [#43343] Cache size of "/var/lib/openqa/cache" is 49GiB, with limit 50GiB
[info] [#43343] Downloading "SLE-15-SP1-Installer-DVD-x86_64-GM-DVD1.iso" from "http://openqa.suse.de/tests/4812998/asset/iso/SLE-15-SP1-Installer-DVD-x86_64-GM-DVD1.iso"
[info] [#43343] Content of "/var/lib/openqa/cache/openqa.suse.de/SLE-15-SP1-Installer-DVD-x86_64-GM-DVD1.iso" has not changed, updating last use
[2020-10-13T10:52:37.0318 CEST] [info] [pid:36934] Rsync from 'rsync://openqa.suse.de/tests' to '/var/lib/openqa/cache/openqa.suse.de', request #43346 sent to Cache Service
with worker-log.txt:
[2020-10-13T10:52:32.0256 CEST] [debug] [pid:36934] Linked asset "/var/lib/openqa/cache/openqa.suse.de/sle-15-SP1-x86_64-4.12.14-1063.1.ga210de1-Server-DVD-Incidents-Kernel-KOTD@64bit-with-ltp.qcow2" to "/var/lib/openqa/pool/16/sle-15-SP1-x86_64-4.12.14-1063.1.ga210de1-Server-DVD-Incidents-Kernel-KOTD@64bit-with-ltp.qcow2"
[2020-10-13T10:52:37.0270 CEST] [debug] [pid:36934] Updating status so job 4812998 is not considered dead.
[2020-10-13T10:52:37.0271 CEST] [debug] [pid:36934] REST-API call: POST http://openqa.suse.de/api/v1/jobs/4812998/status
[2020-10-13T10:52:37.0315 CEST] [debug] [pid:36934] Linked asset "/var/lib/openqa/cache/openqa.suse.de/SLE-15-SP1-Installer-DVD-x86_64-GM-DVD1.iso" to "/var/lib/openqa/pool/16/SLE-15-SP1-Installer-DVD-x86_64-GM-DVD1.iso"
[2020-10-13T10:52:42.0326 CEST] [debug] [pid:36934] Updating status so job 4812998 is not considered dead.
[2020-10-13T10:52:42.0326 CEST] [debug] [pid:36934] REST-API call: POST http://openqa.suse.de/api/v1/jobs/4812998/status
[2020-10-13T10:52:47.0377 CEST] [debug] [pid:36934] Updating status so job 4812998 is not considered dead.
[2020-10-13T10:52:47.0378 CEST] [debug] [pid:36934] REST-API call: POST http://openqa.suse.de/api/v1/jobs/4812998/status
[2020-10-13T10:52:47.0408 CEST] [error] [pid:36934] Unable to setup job 4812998: Cache service status error from API: Minion job #43346 failed: Job terminated unexpectedly (exit code: 2, signal: 0)
[2020-10-13T10:52:47.0408 CEST] [debug] [pid:36934] Stopping job 4812998 from openqa.suse.de: 04812998-sle-15-SP1-Server-DVD-Incidents-Kernel-KOTD-x86_64-Build4.12.14-1063.1.ga210de1-ltp_sched@64bit - reason: setup failure
[2020-10-13T10:52:47.0408 CEST] [debug] [pid:36934] REST-API call: POST http://openqa.suse.de/api/v1/jobs/4812998/status
[2020-10-13T10:52:47.0467 CEST] [info] [pid:26893] Uploading autoinst-log.txt
[2020-10-13T10:52:47.0502 CEST] [info] [pid:26893] Uploading worker-log.txt
Acceptance criteria¶
- AC1: Either job is automatically restarted or the user receives a clear message what the user did wrong
Suggestions¶
- Improve log messages
- Maybe the worker service restarted but this did not retrigger the job. Better restart openQA jobs automatically when this happens
Workaround¶
Retrigger job
Actions