action #44162

Various tests stayed 'running' for ~ 4 hours or longer

Added by dimstar almost 4 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Concrete Bugs
Target version: -
Start date: 2018-11-21
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Sample test:

https://openqa.opensuse.org/tests/801664 (raid10/TW) and https://openqa.opensuse.org/tests/801657 (cryptlvm/leap15.1)

It was a clone of an earlier one that finished incomplete.

The restarted job stayed around for 4 hours but did not really make progress. The test usually finishes much quicker: the normal runtime of RAID10/TW is 30 - 45 minutes.
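
To illustrate the symptom, here is a minimal sketch of how such jobs could be flagged by comparing elapsed time against the usual runtime. This is not openQA code: the job records, the field names (`state`, `started`, `usual_max`), the `factor` threshold and the job ids are all made up for illustration.

```python
from datetime import datetime, timedelta

def find_stalled_jobs(jobs, now, factor=3.0):
    """Return ids of 'running' jobs whose elapsed time exceeds
    factor times their usual maximum runtime.

    The record layout is hypothetical, not openQA's schema.
    """
    stalled = []
    for job in jobs:
        if job["state"] != "running":
            continue
        elapsed = now - job["started"]
        if elapsed > job["usual_max"] * factor:
            stalled.append(job["id"])
    return stalled

# Made-up example data: one job running for 4 hours, one for 20 minutes,
# both with a usual upper bound of 45 minutes.
now = datetime(2024, 1, 1, 12, 0)
jobs = [
    {"id": 1, "state": "running",
     "started": now - timedelta(hours=4), "usual_max": timedelta(minutes=45)},
    {"id": 2, "state": "running",
     "started": now - timedelta(minutes=20), "usual_max": timedelta(minutes=45)},
]
print(find_stalled_jobs(jobs, now))
```

With a factor of 3, the threshold for a 45-minute test is 135 minutes, so only the job that has been running for 4 hours is reported.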


Related issues

Related to openQA Project - action #44105: if workercache dies, we get *tons* of incompletes (Resolved, 2018-11-21)

History

#1 Updated by okurz almost 4 years ago

  • Related to action #44105: if workercache dies, we get *tons* of incompletes added

#2 Updated by okurz over 3 years ago

  • Subject changed from Various tests staid 'running' for ~ 4 hours to Various tests stayed 'running' for ~ 4 hours

#3 Updated by okurz over 3 years ago

  • Category set to Concrete Bugs

#4 Updated by okurz about 3 years ago

  • Subject changed from Various tests stayed 'running' for ~ 4 hours to Various tests stayed 'running' for ~ 4 hours or longer

Let me hijack this ticket to reference the most recent examples on OSD, which have been running for > 6 days (!):

All of them have already been cloned to newer jobs, and they do not even block the workers anymore, as the assigned worker has already executed other jobs just fine. Cancelling the job over the web UI does not work, and restarting the worker instance's systemd service was not successful either. Manually deleting the job does work, but I haven't done that for the above jobs, only for an older one last week.

#5 Updated by okurz over 2 years ago

  • Status changed from New to Resolved
  • Assignee set to okurz

Since then we have improved stall detection and worker handling and refactored the worker-websockets-webui connection. I haven't observed this lately, so I think we actually have it covered.
