action #107017

closed

Random asset download (cache service) failures on openqaworker1

Added by favogt about 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version:
Start date:
Due date:
% Done: 100%
Estimated time:

Description

Observation

On openqaworker1, several jobs ended up incomplete due to a setup failure, e.g.:

https://openqa.opensuse.org/tests/2191909

[2022-02-16T21:47:58.251560+01:00] [debug] Found HDD_1, caching opensuse-Tumbleweed-x86_64-20220215-textmode@64bit.qcow2
[2022-02-16T21:47:59.691676+01:00] [info] Downloading opensuse-Tumbleweed-x86_64-20220215-textmode@64bit.qcow2, request #8258 sent to Cache Service
[2022-02-16T22:44:08.642013+01:00] [info] +++ worker notes +++
[2022-02-16T22:44:08.652197+01:00] [info] End time: 2022-02-16 21:44:08
[2022-02-16T22:44:08.652264+01:00] [info] Result: timeout

https://openqa.opensuse.org/tests/2193698

[2022-02-17T13:37:16.233412+01:00] [debug] Found ISO, caching openSUSE-Leap-15.3-GNOME-Live-x86_64-Build9.374-Media.iso
[2022-02-17T13:37:40.012614+01:00] [info] Downloading openSUSE-Leap-15.3-GNOME-Live-x86_64-Build9.374-Media.iso, request #9191 sent to Cache Service
[2022-02-17T14:33:00.965395+01:00] [info] +++ worker notes +++
[2022-02-17T14:33:00.965569+01:00] [info] End time: 2022-02-17 13:33:00
[2022-02-17T14:33:00.965644+01:00] [info] Result: timeout
[2022-02-17T14:33:00.974060+01:00] [info] Uploading autoinst-log.txt
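
To put a number on how long a job sat waiting for the Cache Service before it was given up on, the timestamps in the excerpts above can be diffed. The following is only a rough sketch: the helper name `wait_before_timeout` and the regular expression are made up for illustration and are not part of openQA; the log lines are copied from the first job above.

```python
import re
from datetime import datetime

# Matches lines such as "[<ISO timestamp>] [<level>] <message>".
# This pattern is an assumption based on the excerpts in this ticket.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\] (?P<msg>.*)$")


def wait_before_timeout(log_text):
    """Return seconds between the cache request and 'Result: timeout', or None."""
    requested = None
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        timestamp = datetime.fromisoformat(match.group("ts"))
        message = match.group("msg")
        if "sent to Cache Service" in message:
            requested = timestamp
        elif message == "Result: timeout" and requested is not None:
            return (timestamp - requested).total_seconds()
    return None


log = """\
[2022-02-16T21:47:59.691676+01:00] [info] Downloading opensuse-Tumbleweed-x86_64-20220215-textmode@64bit.qcow2, request #8258 sent to Cache Service
[2022-02-16T22:44:08.642013+01:00] [info] +++ worker notes +++
[2022-02-16T22:44:08.652264+01:00] [info] Result: timeout
"""

seconds = wait_before_timeout(log)
print(f"waited {seconds / 60:.1f} minutes before the timeout")
```

For the first job this comes out at roughly 56 minutes between the download request and the timeout; the second job shows a similar gap.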

Most likely this started when the worker count and cache sizes were increased following the NVMe deployment.

The minion web interface shows some errors for the GNOME-Live download:

---
args:
- 2193699
- iso
- openSUSE-Leap-15.3-GNOME-Live-x86_64-Build9.374-Media.iso
- http://openqa1-opensuse
attempts: 1
children: []
created: 2022-02-17T12:36:17Z
delayed: 2022-02-17T12:36:17Z
expires: ~
finished: 2022-02-17T12:38:45Z
id: 9188
lax: 0
notes:
  lock: openSUSE-Leap-15.3-GNOME-Live-x86_64-Build9.374-Media.iso.http://openqa1-opensuse
parents: []
priority: 0
queue: default
result: 'Job terminated unexpectedly (exit code: 11, signal: 0)'
retried: ~
retries: 0
started: 2022-02-17T12:36:43Z
state: failed
task: cache_asset
time: 2022-02-17T13:45:42Z
worker: 18
---
args:
- 2193696
- iso
- openSUSE-Leap-15.3-GNOME-Live-x86_64-Build9.374-Media.iso
- http://openqa1-opensuse
attempts: 1
children: []
created: 2022-02-17T12:36:56Z
delayed: 2022-02-17T12:36:56Z
expires: ~
finished: 2022-02-17T12:41:18Z
id: 9190
lax: 0
notes:
  lock: openSUSE-Leap-15.3-GNOME-Live-x86_64-Build9.374-Media.iso.http://openqa1-opensuse
parents: []
priority: 0
queue: default
result: 'Job terminated unexpectedly (exit code: 2, signal: 0)'
retried: ~
retries: 0
started: 2022-02-17T12:38:11Z
state: failed
task: cache_asset
time: 2022-02-17T13:45:42Z
worker: 18
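
For comparing the two failed `cache_asset` jobs, the runtime and exit code can be pulled out of the records above. This is a minimal sketch that assumes the relevant fields were copied by hand into dicts; `summarize` is a hypothetical helper, not a Minion API, and the values are taken verbatim from the dump.

```python
import re
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # timestamp format used in the dump above
RESULT_RE = re.compile(r"exit code: (\d+), signal: (\d+)")

# Fields copied from the two failed jobs shown above.
jobs = [
    {
        "id": 9188,
        "started": "2022-02-17T12:36:43Z",
        "finished": "2022-02-17T12:38:45Z",
        "result": "Job terminated unexpectedly (exit code: 11, signal: 0)",
    },
    {
        "id": 9190,
        "started": "2022-02-17T12:38:11Z",
        "finished": "2022-02-17T12:41:18Z",
        "result": "Job terminated unexpectedly (exit code: 2, signal: 0)",
    },
]


def summarize(job):
    """Return the job id, runtime in seconds, exit code and signal."""
    started = datetime.strptime(job["started"], TIME_FORMAT)
    finished = datetime.strptime(job["finished"], TIME_FORMAT)
    match = RESULT_RE.search(job["result"])
    return {
        "id": job["id"],
        "runtime_s": int((finished - started).total_seconds()),
        "exit_code": int(match.group(1)) if match else None,
        "signal": int(match.group(2)) if match else None,
    }


for job in jobs:
    print(summarize(job))
```

Both jobs died within a few minutes of starting, far sooner than the point at which the openQA jobs themselves timed out.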

This might be caused by network issues, but I couldn't find any further information.


Related issues 2 (0 open, 2 closed)

Is duplicate of openQA Infrastructure - action #103524: OW1: performance loss size:M (Resolved, mkittler, 2021-12-06)

Copied from openQA Infrastructure - action #103524: OW1: performance loss size:M (Resolved, mkittler, 2021-12-06)
