action #44693

Caching issue on new snapshots synced to o3 - no cache minion workers available

Added by dimstar over 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Concrete Bugs
Target version: -
Start date: 2018-12-04
Due date:
% Done: 0%
Estimated time:
Difficulty:
Description

During the night of Dec 3/4, Snapshot 1203 was synced to openQA and jobs were created for execution.

A large number of jobs (roughly 70) seem to have been marked 'incomplete' within minutes, as files failed to be cached and the jobs thus failed to start.

Sample tests:
https://openqa.opensuse.org/tests/808154 (OW4)

[2018-12-04T03:35:08.0090 CET] [info] result: setup failure: Can't download openSUSE-Tumbleweed-DVD-x86_64-Snapshot20181203-Media.iso to /var/lib/openqa/cache/openqa1-opensuse/openSUSE-Tumbleweed-DVD-x86_64-Snapshot20181203-Media.iso

https://openqa.opensuse.org/tests/808312 (imegatester)

[2018-12-04T03:35:52.0783 CET] [info] result: setup failure: No workers active in the cache service.

The logs indicate that the two workers might have different issues; I could not find any incomplete job that had been dispatched to OW1.
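
To check whether the roughly 70 incompletes split into the two failure modes quoted above, the "setup failure" lines could be grouped by reason. The following is only a rough sketch: the category names and the `summarize` helper are made up for illustration and are not part of openQA, and the regular expression is based solely on the two log lines in this ticket.

```python
import re
from collections import Counter

# Matches "setup failure" result lines in the format quoted in this ticket.
SETUP_FAILURE = re.compile(r"\[info\] result: setup failure: (?P<reason>.*)$")


def classify(reason):
    """Map a failure reason onto a coarse category (hypothetical names)."""
    if reason.startswith("Can't download "):
        return "download_failed"
    if reason.startswith("No workers active in the cache service"):
        return "no_cache_workers"
    return "other"


def summarize(lines):
    """Count setup failures per category; lines without a failure are skipped."""
    counts = Counter()
    for line in lines:
        match = SETUP_FAILURE.search(line)
        if match:
            counts[classify(match.group("reason"))] += 1
    return counts


# The two sample lines from the description above.
SAMPLE = [
    "[2018-12-04T03:35:08.0090 CET] [info] result: setup failure: "
    "Can't download openSUSE-Tumbleweed-DVD-x86_64-Snapshot20181203-Media.iso "
    "to /var/lib/openqa/cache/openqa1-opensuse/"
    "openSUSE-Tumbleweed-DVD-x86_64-Snapshot20181203-Media.iso",
    "[2018-12-04T03:35:52.0783 CET] [info] result: setup failure: "
    "No workers active in the cache service.",
]

if __name__ == "__main__":
    for category, count in sorted(summarize(SAMPLE).items()):
        print(category, count)
```

Feeding it the log lines of all incomplete jobs from one worker at a time would show whether each worker fails in only one of the two ways.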


Related issues

Related to openQA Project - action #44105: if workercache dies, we get *tons* of incompletes | Resolved | 2018-11-21

History

#1 Updated by coolo over 4 years ago

https://openqa.opensuse.org/tests/808095 had it downloaded correctly, but was restarted by autodeploy (I assume), and https://openqa.opensuse.org/tests/808154 afterwards had download failures.

So autodeploy left the cache in limbo.

#2 Updated by okurz over 4 years ago

  • Related to action #44105: if workercache dies, we get *tons* of incompletes added

#3 Updated by szarate over 4 years ago

  • Subject changed from Caching issue on new snapshots synced to o3 to Caching issue on new snapshots synced to o3 - no cache minion workers available

#4 Updated by okurz almost 4 years ago

  • Category set to Concrete Bugs

#5 Updated by okurz over 3 years ago

  • Status changed from New to Resolved
  • Assignee set to okurz

It could be that #44105 really solved it. I have not seen the problem since then.
