action #42980

closed

job stayed in assigned but is dead

Added by coolo over 5 years ago. Updated about 5 years ago.

Status: Rejected
Priority: Normal
Assignee:
Category: Regressions/Crashes
Target version: -
Start date: 2018-10-27
Due date:
% Done: 0%
Estimated time:

Description

https://openqa.suse.de/tests/2214971 has been in state "assigned" for hours now.

The worker is active and was last seen 1 minute ago, but there is no
sign of this job. The worker itself is in a strange state:

Oct 26 18:48:27 openqaworker2 worker[8606]: [info] 881: WORKING 2213457
Oct 26 20:30:08 openqaworker2 worker[8606]: [error] 400 response: Bad Request (remaining tries: 2)
Oct 26 20:30:09 openqaworker2 worker[8606]: [info] Isotovideo exit status: 1
Oct 26 20:30:13 openqaworker2 worker[8606]: [error] 400 response: Bad Request (remaining tries: 1)
Oct 26 20:30:18 openqaworker2 worker[8606]: [error] 400 response: Bad Request (remaining tries: 0)
Oct 26 20:30:18 openqaworker2 worker[8606]: [error] Job aborted because web UI doesn't accept updates anymore (likely considers this job dead)
Oct 26 20:30:18 openqaworker2 worker[8606]: [error] 400 response: Bad Request (remaining tries: 2)
Oct 26 20:30:23 openqaworker2 worker[8606]: [error] 400 response: Bad Request (remaining tries: 1)
Oct 26 20:30:28 openqaworker2 worker[8606]: [info] Collected unknown process with pid 31056 and exit status: 0
Oct 26 20:30:28 openqaworker2 worker[8606]: [info] registering worker openqaworker2 version 13 with openQA openqa.suse.de using protocol version [1]
Oct 26 20:30:28 openqaworker2 worker[8606]: [error] 400 response: Bad Request (remaining tries: 0)
Oct 26 20:30:28 openqaworker2 worker[8606]: [info] +++ worker notes +++
Oct 26 20:30:28 openqaworker2 worker[8606]: [info] end time: 2018-10-26 18:30:28
Oct 26 20:30:28 openqaworker2 worker[8606]: [info] result: died
Oct 26 20:30:28 openqaworker2 worker[8606]: [info] uploading consoletest_setup-loadavg_consoletest_setup.txt
Oct 26 20:30:28 openqaworker2 worker[8606]: [error] ERROR consoletest_setup-loadavg_consoletest_setup.txt: 404 response: Not Found
Oct 26 20:30:28 openqaworker2 worker[8606]: [info] uploading video.ogv
Oct 26 20:30:29 openqaworker2 worker[8606]: [error] ERROR video.ogv: 404 response: Not Found
Oct 26 20:30:29 openqaworker2 worker[8606]: can't open /var/lib/openqa/pool/23/testresults/result-cleanup_before_shutdown.json: No such file or directory at /usr/share/openqa/script/../lib/OpenQA/Worker/Jobs.pm line 900.
Oct 26 20:30:29 openqaworker2 worker[8606]: [error] 400 response: Bad Request (remaining tries: 2)
Oct 26 20:30:34 openqaworker2 worker[8606]: [error] 400 response: Bad Request (remaining tries: 1)
Oct 26 20:30:38 openqaworker2 worker[8606]: [info] registering worker openqaworker2 version 13 with openQA openqa.suse.de using protocol version [1]
Oct 26 20:30:39 openqaworker2 worker[8606]: [error] 400 response: Bad Request (remaining tries: 0)
Oct 26 20:30:39 openqaworker2 worker[8606]: [error] Job aborted because web UI doesn't accept updates anymore (likely considers this job dead)
Oct 26 20:30:49 openqaworker2 worker[8606]: [info] Collected unknown process with pid 31160 and exit status: 0
Oct 26 20:30:49 openqaworker2 worker[8606]: [info] registering worker openqaworker2 version 13 with openQA openqa.suse.de using protocol version [1]
Oct 26 21:53:06 openqaworker2 worker[8606]: [warn] Received command cancel for different job id 2214393 (our 2213457). Ignoring!
Oct 27 01:20:11 openqaworker2 worker[8606]: [warn] Received command cancel for different job id 2213548 (our 2213457). Ignoring!
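
For anyone checking other workers for the same pattern, here is a minimal sketch that scans journal lines like the ones above. It counts how often the retries ran out ("remaining tries: 0") and collects the cancel commands that were ignored because of a job id mismatch. The function name `summarize` and both regular expressions are assumptions made for illustration, derived only from the log format quoted in this ticket; they are not part of openQA.

```python
import re

# Assumed layout of the quoted journal lines:
# "<Mon> <day> <HH:MM:SS> <host> worker[<pid>]: [<level>] <message>"
# The "[<level>] " part is optional, since some lines above lack it.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) worker\[(?P<pid>\d+)\]: "
    r"(?:\[(?P<level>\w+)\] )?(?P<msg>.*)$"
)

# Matches the warning the worker prints when it ignores a cancel command.
CANCEL_RE = re.compile(
    r"Received command cancel for different job id (\d+) \(our (\d+)\)"
)


def summarize(lines):
    """Return (exhausted_retry_rounds, ignored_cancels) for the given log lines.

    ignored_cancels is a list of (requested_job_id, job_id_the_worker_holds).
    """
    exhausted = 0
    ignored_cancels = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip anything that is not a worker journal line
        msg = match.group("msg")
        if "400 response" in msg and "remaining tries: 0" in msg:
            exhausted += 1
        cancel = CANCEL_RE.search(msg)
        if cancel:
            ignored_cancels.append((int(cancel.group(1)), int(cancel.group(2))))
    return exhausted, ignored_cancels
```

A worker that keeps reporting ignored cancels against the same "our" job id long after the retries were exhausted is probably in the state described here.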

Related issues: 2 (0 open, 2 closed)

Related to openQA Project - coordination #47117: [epic] Fix worker->websocket->scheduler->webui connection (Resolved, okurz, 2019-02-04)

Has duplicate openQA Project - action #48422: Workers stay in inconsistent job relationship (Resolved, mkittler, 2019-02-26)

Actions #1

Updated by mkittler over 5 years ago

Maybe that strange state is similar to https://progress.opensuse.org/issues/40004 and has already been improved by https://github.com/os-autoinst/openQA/commit/3d1e3f38139519d5150cdd6a7e9103823e81488c. It would be interesting to know whether that commit had already been deployed on this worker at that time.

Actions #2

Updated by coolo about 5 years ago

  • Related to coordination #47117: [epic] Fix worker->websocket->scheduler->webui connection added

Actions #3

Updated by mkittler about 5 years ago

  • Has duplicate action #48422: Workers stay in inconsistent job relationship added

Actions #4

Updated by mkittler about 5 years ago

  • Status changed from New to Rejected
  • Assignee set to mkittler
  • Target version deleted (Ready)

This issue has been duplicated by https://progress.opensuse.org/issues/48422. Judging by the log, it is really the same problem. Since the other ticket has more details in its comments, I'm closing this one.
