Actions
action #19564
closed[tools]worker is unresponsive for three days but reports as online to the webui because of cache database locked?
Status:
Closed
Priority:
Normal
Assignee:
-
Category:
Regressions/Crashes
Target version:
-
Start date:
2017-06-04
Due date:
% Done:
0%
Estimated time:
Description
observation¶
No jobs are executed on worker class "zkvm-image", i.e. openqaworker2:7.
journal output on that worker:
Jun 01 22:32:13 openqaworker2 worker[1662]: [INFO] OpenQA::Worker::Cache: Initialized with http://openqa.suse.de at /var/lib/openqa/cache, current size is 47971230641
Jun 01 22:32:13 openqaworker2 worker[1662]: [DEBUG] Found HDD_1, caching sle-12-SP1-Server-DVD-s390x-sdk+allpatterns.qcow2
Jun 01 22:32:13 openqaworker2 worker[1662]: [DEBUG] CACHE: Aquiring lock for /var/lib/openqa/cache/sle-12-SP1-Server-DVD-s390x-sdk+allpatterns.qcow2 in the database
Jun 01 22:32:19 openqaworker2 worker[1662]: [DEBUG] Update status so job is not considered dead.
Jun 01 22:32:39 openqaworker2 worker[1662]: [DEBUG] Update status so job is not considered dead.
Jun 01 22:34:32 openqaworker2 worker[1662]: [DEBUG] Update status so job is not considered dead.
Jun 01 22:35:02 openqaworker2 worker[1662]: DBD::SQLite::st execute failed: database is locked at /usr/share/openqa/script/../lib/OpenQA/Worker/Cache.pm line 212.
Jun 01 22:35:02 openqaworker2 worker[1662]: [ERROR] toggle_asset_lock: Rolling back DBD::SQLite::st execute failed: database is locked at /usr/share/openqa/script/../lib/OpenQA/Worker/Cache.pm line 212.
Jun 01 22:35:02 openqaworker2 worker[1662]: rollback ineffective with AutoCommit enabled at /usr/share/openqa/script/../lib/OpenQA/Worker/Cache.pm line 216.
Jun 01 22:35:07 openqaworker2 worker[1662]: [WARN] job is missing files, releasing job
Jun 01 22:35:07 openqaworker2 worker[1662]: [ERROR] 404 response: Not Found (remaining tries: 0)
Jun 01 22:35:07 openqaworker2 worker[1662]: [ERROR] ERROR autoinst-log.txt: 404 response: Not Found
Jun 01 22:35:07 openqaworker2 worker[1662]: [DEBUG] Either there is no job running or we were asked to stop: (1|Reason: api-failure)
Jun 01 22:35:07 openqaworker2 worker[1662]: [INFO] cleaning up 00975813-sle-12-SP3-Server-DVD-s390x-Build0409-om_proxyscc_sles12sp1_sdk+allpatterns_full_update_by_yast_s390x@zkvm-image
Jun 01 22:35:07 openqaworker2 worker[1662]: [INFO] got job 975814: 00975814-sle-12-SP3-Server-DVD-s390x-Build0409-om_proxyscc_sles12sp2_allpatterns_full_update_by_zypper_s390x@zkvm-image
Jun 01 22:35:37 openqaworker2 worker[1662]: DBD::SQLite::db prepare failed: database is locked at /usr/share/openqa/script/../lib/OpenQA/Worker/Cache.pm line 345.
Jun 01 22:35:37 openqaworker2 worker[1662]: Mojo::Reactor::Poll: Timer failed: DBD::SQLite::db prepare failed: database is locked at /usr/share/openqa/script/../lib/OpenQA/Worker/Cache.pm line 345.
lines 243531-243591/243591 (END)
reproducible¶
TBD
problem¶
Is the locked database a symptom or the problem causing the worker to be stuck?
This is the output when stopping the worker
Jun 04 08:34:09 openqaworker2 systemd[1]: Stopping openQA Worker #7...
Jun 04 08:34:09 openqaworker2 worker[1662]: [ERROR] 404 response: Not Found (remaining tries: 0)
Jun 04 08:34:09 openqaworker2 worker[1662]: [INFO] registering worker with openQA http://openqa.suse.de...
Jun 04 08:34:09 openqaworker2 worker[1662]: [INFO] quit due to signal TERM
Jun 04 08:34:09 openqaworker2 worker[1662]: [DEBUG] Either there is no job running or we were asked to stop: (1|Reason: api-failure)
Jun 04 08:34:09 openqaworker2 worker[1662]: [DEBUG] duplicating job 975814
Jun 04 08:34:09 openqaworker2 worker[1662]: [INFO] cleaning up 00975814-sle-12-SP3-Server-DVD-s390x-Build0409-om_proxyscc_sles12sp2_allpatterns_full_update_by_zypper_s390x@zkvm-image
Jun 04 08:34:09 openqaworker2 systemd[1]: Stopped openQA Worker #7.
Why the "[ERROR] 404 response: Not Found" when trying to stop the worker?
When there is no output of "[DEBUG] Update status so job is not considered dead." in the logfile for three days shouldn't the worker be considered dead by the scheduler?
workaround¶
Restart the worker
Actions