action #163766

closed

Scripts CI | Failed pipeline for master (asset failure: Failed to download ....qcow2) size:S

Added by tinita 5 months ago. Updated 5 months ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Start date: 2024-07-11
Due date:
% Done: 0%
Estimated time:

Description

Observation

We are seeing this error and a couple of incompletes today:

Date: Thu, 11 Jul 2024 11:51:31 +0000
From: "GitLab@SUSE" <gitlab@suse.de>
To: osd-admins@suse.de
Subject: Scripts CI | Failed pipeline for master | 33a115c3

https://gitlab.suse.de/openqa/scripts-ci/-/pipelines/1207977
https://stats.openqa-monitor.qa.suse.de/d/nRDab3Jiz/openqa-jobs-test?orgId=1&from=1720656188837&to=1720699027331&viewPanel=16

https://openqa.suse.de/tests/14894947

Result: incomplete, finished 10 minutes ago (ran for 36:46 minutes)
Reason: asset failure: Failed to download SLES-15-SP5-x86_64-mru-install-minimal-with-addons-Build20240710-1-Server-DVD-Updates-64bit.qcow2 to /var/lib/openqa/cache/openqa.suse.de/SLES-15-SP5-x86_64-mru-install-minimal-with-addons-Build20240710-1-Server-DVD-Updates-64bit.qcow2

The detailed error message is on e.g. https://openqa.suse.de/tests/14894946:

[info] [#21729] Download error 598, waiting 5 seconds for next try (1 remaining)

(see https://http.dev/598)
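
The log line above suggests a bounded retry loop with a fixed wait between attempts. The following is only a minimal sketch of that pattern for illustration, not the actual openQA cache service code. The function name `download_with_retries`, the `fetch` callable and the rule that only 598 and other 5xx codes are retried are all assumptions.

```python
import time
from typing import Callable

# Non-standard timeout code seen in the job log; treating it (and any 5xx)
# as retryable is an assumption made for this sketch.
TRANSIENT = {598}


def download_with_retries(
    fetch: Callable[[], int],
    attempts: int = 5,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    log: Callable[[str], None] = print,
) -> bool:
    """Call fetch() until it returns a 2xx status or the attempts run out.

    fetch is a hypothetical callable that performs one download attempt
    and returns the HTTP status code it got.
    """
    for attempt in range(1, attempts + 1):
        status = fetch()
        if 200 <= status < 300:
            return True
        remaining = attempts - attempt
        retryable = status in TRANSIENT or status >= 500
        if remaining == 0 or not retryable:
            log(f"Download error {status}, giving up")
            return False
        log(
            f"Download error {status}, waiting {delay:g} seconds "
            f"for next try ({remaining} remaining)"
        )
        sleep(delay)
    return False
```

With such a loop, a download that keeps timing out ends as a failed asset download once the last attempt is used up, which would match the incomplete jobs reported above.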

Suggestions

  • Look for affected jobs, e.g. using openqa-label-known-issues or by querying the database, and retrigger them
  • Because of the coincidence in timing this is very likely related to #163592, but assets should be served by NGINX directly (so the unresponsive Mojo web app should not have an impact)
  • Verify that asset downloads work independently of the web app (e.g. by briefly stopping the web app and trying to download the asset)
    • Maybe the web app is still involved for some redirection? That would be fine but of course would mean this is related to #163592.
    • Can we split that completely?
  • Check how this timeout is handled by NGINX
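
The check of whether asset downloads work independently of the web app could be scripted roughly as follows. This is a sketch under assumptions: `probe_asset` and `asset_ok` are hypothetical helper names, the asset URL has to be supplied by whoever runs it (the ticket does not give one), and the server is assumed to honor `Range` requests so that only one byte is transferred.

```python
import time
import urllib.error
import urllib.request


def probe_asset(url: str, timeout: float = 30.0) -> dict:
    """Request only the first byte of an asset and report status and timing.

    urlopen follows redirects on its own, so final_url shows where the
    request ended up. Connection errors and socket timeouts are not caught
    here and propagate to the caller.
    """
    req = urllib.request.Request(url, headers={"Range": "bytes=0-0"})
    start = time.monotonic()
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            status, final_url = resp.status, resp.url
    except urllib.error.HTTPError as err:
        # Raised for 4xx/5xx answers, which would include a 598.
        status, final_url = err.code, url
    return {
        "status": status,
        "final_url": final_url,
        "elapsed": time.monotonic() - start,
    }


def asset_ok(result: dict) -> bool:
    """True if the probe got the asset (200) or the requested range (206)."""
    return result["status"] in (200, 206)


# Usage idea (not executed here): stop the web app briefly, then run
#   result = probe_asset("<asset URL>")
#   print(result, asset_ok(result))
```

If the probe still succeeds while the web app is stopped, that would suggest the asset is served without the app. An error status, or a `final_url` that differs from the requested one, would hint that the app or a redirection is still involved.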

Rollback actions


Related issues 1 (1 open, 0 closed)

Related to openQA Infrastructure (public) - action #164400: Feature: Continue failed downloads without starting from the beginning in cacheservice (New, 2024-07-24)
