action #40004

closed

worker continues to work on a job which it as well as the web UI considers dead

Added by okurz over 6 years ago. Updated about 6 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2018-08-20
Due date:
% Done: 0%
Estimated time:
Description

Observation

From o3:/var/log/openqa

[2018-08-20T09:17:48.0733 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738408
[2018-08-20T09:17:48.0770 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738408
[2018-08-20T09:17:49.0404 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738409
[2018-08-20T09:17:49.0492 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738409
[2018-08-20T09:17:50.0164 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738408
[2018-08-20T09:17:50.0203 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738408
[2018-08-20T09:17:50.0524 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738408
[2018-08-20T09:17:50.0558 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738408
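
To see which jobs are affected and how often, the messages above can be tallied per job ID. The following is a minimal Python sketch, assuming the log line format shown in this excerpt; the function name `count_orphan_artefacts` and the regular expression are illustrative and not part of openQA.

```python
import re
from collections import Counter

# Log lines copied from the excerpt above (o3:/var/log/openqa).
LOG = """\
[2018-08-20T09:17:48.0733 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738408
[2018-08-20T09:17:48.0770 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738408
[2018-08-20T09:17:49.0404 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738409
[2018-08-20T09:17:49.0492 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738409
[2018-08-20T09:17:50.0164 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738408
[2018-08-20T09:17:50.0203 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738408
[2018-08-20T09:17:50.0524 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738408
[2018-08-20T09:17:50.0558 UTC] [info] Got artefact for job with no worker assigned (maybe running job already considered dead): 738408
"""

# Assumed format: the job ID is the trailing number after the last colon.
PATTERN = re.compile(r"Got artefact for job with no worker assigned .*: (\d+)\s*$")


def count_orphan_artefacts(log_text: str) -> Counter:
    """Count 'artefact for job with no worker' messages per job ID."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


counts = count_orphan_artefacts(LOG)
for job_id, n in sorted(counts.items()):
    print(job_id, n)
```

On this excerpt the tally is 6 messages for job 738408 and 2 for job 738409, all within about three seconds.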

https://openqa.opensuse.org/tests/738408 reveals that the job ran on openqaworker4:11, which shows the following in its log from the same time period:

Aug 20 11:07:03 openqaworker4 worker[30027]: [error] Job aborted because web UI doesn't accept new images anymore (likely considers this job dead)
Aug 20 11:07:33 openqaworker4 worker[30027]: [error] Job aborted because web UI doesn't accept new images anymore (likely considers this job dead)
Aug 20 11:07:53 openqaworker4 worker[30027]: [error] Job aborted because web UI doesn't accept new images anymore (likely considers this job dead)
Aug 20 11:08:03 openqaworker4 worker[30027]: [error] Job aborted because web UI doesn't accept new images anymore (likely considers this job dead)
Aug 20 11:09:17 openqaworker4 worker[30027]: [error] Job aborted because web UI doesn't accept new images anymore (likely considers this job dead)

Problem

It looks like both the web UI and the worker agree that the job should not be worked on, but the worker still does not stop. What gives?
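
For illustration, the behaviour one would expect is that the first rejection ends the upload loop instead of being logged repeatedly. This is a hypothetical Python sketch of that expectation, not openQA's actual worker code; `upload_until_rejected`, the `upload` callback and the stub `fake_upload` are all assumptions made for the example.

```python
from typing import Callable, Iterable, List


def upload_until_rejected(artefacts: Iterable[str],
                          upload: Callable[[str], bool]) -> List[str]:
    """Upload artefacts in order and stop at the first rejection.

    `upload` is a hypothetical callback that returns False once the web UI
    no longer accepts results for the job. Returns the accepted artefacts.
    """
    accepted = []
    for artefact in artefacts:
        if not upload(artefact):
            # The web UI considers the job dead: stop instead of retrying.
            break
        accepted.append(artefact)
    return accepted


# Hypothetical web UI stub that rejects everything after two uploads.
attempts: List[str] = []


def fake_upload(name: str) -> bool:
    attempts.append(name)
    return len(attempts) <= 2


accepted = upload_until_rejected(["a.png", "b.png", "c.png", "d.png"], fake_upload)
```

With this stub, only three uploads are attempted and the fourth artefact is never sent, whereas the worker log above shows the abort message recurring for more than two minutes.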


Related issues 2 (0 open, 2 closed)

Related to openQA Project (public) - action #39743: [o3][tools] o3 unusable, often responds with 504 Gateway Time-out (Resolved, okurz, 2018-08-15)
Related to openQA Project (public) - action #39833: [tools] When a worker is abruptly killed, jobs get blocked - CACHE: Being downloaded by another worker, sleeping (Resolved, EDiGiacinto, 2018-08-16)