action #65627

[kernel] Job fails to download LTP asset file

Added by pcervinka over 1 year ago. Updated 11 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: QE Kernel - QE Kernel Done
Start date: 2020-04-15
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Jobs started failing with an asset download error after recent changes to LTP asset handling.

Related job: https://openqa.suse.de/tests/4124669

Likely error from autoinst-log.txt:
[2020-04-15T14:35:55.0612 CEST] [info] [pid:143562] +++ setup notes +++
[2020-04-15T14:35:55.0612 CEST] [info] [pid:143562] Running on grenache-1:5 (Linux 4.12.14-lp151.28.40-default #1 SMP Fri Mar 6 13:48:15 UTC 2020 (f0f1262) ppc64le)
[2020-04-15T14:35:55.0617 CEST] [debug] [pid:143562] Found ASSET_1, caching runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz
[2020-04-15T14:35:55.0619 CEST] [info] [pid:143562] Downloading runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz, request #82339 sent to Cache Service
[2020-04-15T14:36:01.0241 CEST] [info] [pid:143562] Download of runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz processed:
[info] [#82339] Cache size of "/var/lib/openqa/cache" is 49GiB, with limit 50GiB
[info] [#82339] Downloading "runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz" from "http://openqa.suse.de/tests/4124669/asset/other/runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz"
[info] [#82339] Download of "/var/lib/openqa/cache/openqa.suse.de/runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz" failed: 404 Not Found

[2020-04-15T14:36:01.0252 CEST] [error] [pid:143562] Failed to download runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz to /var/lib/openqa/cache/openqa.suse.de/runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz
[2020-04-15T14:36:01.0742 CEST] [info] [pid:143562] +++ worker notes +++
[2020-04-15T14:36:01.0742 CEST] [info] [pid:143562] End time: 2020-04-15 12:36:01
[2020-04-15T14:36:01.0742 CEST] [info] [pid:143562] Result: setup failure
[2020-04-15T14:36:01.0746 CEST] [info] [pid:187969] Uploading autoinst-log.txt

Related issues

Related to openQA Tests - action #66007: [kernel] Add ASSET_1 variable back for PowerVM and baremetal jobs (Rejected, 2020-04-23)

History

#1 Updated by pcervinka over 1 year ago

  • Description updated (diff)

#2 Updated by MDoucha over 1 year ago

This appears to be a bug in asset registration. The download URL used by the cache service above always returns error 404:
http://openqa.suse.de/tests/4124669/asset/other/runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz

However, if you replace the job ID in the URL with the ID of the parent job which produced the tarball, it'll work just fine:
http://openqa.suse.de/tests/4124589/asset/other/runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz

What may be relevant to the cause of the bug:

  • The jobs are in different job groups
  • The jobs are linked together via START_DIRECTLY_AFTER_TEST (not the usual START_AFTER_TEST)
  • The tarball is a public asset, so it should be accessible from both job groups
  • Restarting the failed job will result in successful asset registration and download, see https://openqa.suse.de/tests/4124669#next_previous
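
The URL comparison above can be reproduced with a short script. This is only a minimal sketch: the helper name `with_job_id` is hypothetical and not part of openQA, and it assumes the asset URL has the path shape `/tests/<job id>/asset/...` seen in the two URLs quoted in this comment.

```python
from urllib.parse import urlsplit, urlunsplit


def with_job_id(asset_url: str, job_id: int) -> str:
    """Return asset_url with the job ID in /tests/<id>/asset/... replaced.

    Hypothetical helper for illustration; not an openQA API.
    """
    parts = urlsplit(asset_url)
    segments = parts.path.split("/")
    # Expected path shape: ['', 'tests', '<job id>', 'asset', ...]
    if len(segments) < 4 or segments[1] != "tests" or segments[3] != "asset":
        raise ValueError(f"not a job asset URL: {asset_url}")
    segments[2] = str(job_id)
    return urlunsplit(parts._replace(path="/".join(segments)))


# URL requested by the cache service for the failing job (returns 404).
FAILING_URL = (
    "http://openqa.suse.de/tests/4124669/asset/other/"
    "runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz"
)

if __name__ == "__main__":
    # 4124589 is the parent job that produced the tarball.
    print(with_job_id(FAILING_URL, 4124589))
```

Swapping in the parent job ID yields the second URL quoted above, which is the one that downloads correctly.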

#3 Updated by pvorel over 1 year ago

Different job groups shouldn't be an issue (we have LTP jobs in other groups that work).

#4 Updated by pvorel over 1 year ago

The same problem occurs in the LTP baremetal tests, which suggests a problem with START_DIRECTLY_AFTER_TEST:
https://openqa.suse.de/tests/4125856#dependencies

#5 Updated by pvorel over 1 year ago

  • Project changed from SUSE QA to openQA Project

#6 Updated by pcervinka over 1 year ago

  • Status changed from New to Feedback
  • Assignee set to MDoucha
  • Target version set to 445

Martin's PR was merged: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/10047/files
Let's wait for the result of the next build (or a restart of the old one).

#7 Updated by mkittler over 1 year ago

I know I mentioned in the chat that this is likely not a problem with START_DIRECTLY_AFTER_TEST, because the execution sequence of the tests is correct. However, I recently investigated a different asset problem and noticed that asset registration actually takes chained dependencies into account. Looking at the code, it seems that the logic applied to regular START_AFTER_TEST dependencies has to be applied to START_DIRECTLY_AFTER_TEST dependencies as well. I'll create a fix.
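
To illustrate the idea, here is a rough model of the lookup. It is only a sketch under assumptions and not the actual openQA code: the `Job` class, the `find_asset_owner` function and the way dependencies are stored are all hypothetical. It shows how a lookup that follows only START_AFTER_TEST parents misses an asset produced by a START_DIRECTLY_AFTER_TEST parent, and how following both dependency types finds it.

```python
from dataclasses import dataclass, field

# Dependency types, named after the job settings discussed in this ticket.
CHAINED = "START_AFTER_TEST"
DIRECTLY_CHAINED = "START_DIRECTLY_AFTER_TEST"


@dataclass
class Job:
    """Hypothetical, simplified stand-in for an openQA job."""
    job_id: int
    assets: set = field(default_factory=set)      # asset names this job produced
    parents: list = field(default_factory=list)   # (dependency type, Job) pairs


def find_asset_owner(job, asset, followed=(CHAINED, DIRECTLY_CHAINED)):
    """Walk up the parents of `job` and return the job that produced `asset`.

    Only dependency types listed in `followed` are traversed.
    Returns None if no reachable job produced the asset.
    """
    seen = set()
    stack = [job]
    while stack:
        current = stack.pop()
        if current.job_id in seen:
            continue
        seen.add(current.job_id)
        if asset in current.assets:
            return current
        stack.extend(p for dep, p in current.parents if dep in followed)
    return None


if __name__ == "__main__":
    tarball = "runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz"
    parent = Job(4124589, assets={tarball})
    child = Job(4124669, parents=[(DIRECTLY_CHAINED, parent)])

    # Following only START_AFTER_TEST misses the producing job.
    print(find_asset_owner(child, tarball, followed=(CHAINED,)))
    # Following both dependency types finds the parent job.
    print(find_asset_owner(child, tarball).job_id)
```

In this model the failing job corresponds to the first call: the asset gets attributed to no parent, so the download URL ends up pointing at the wrong job.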

#9 Updated by pcervinka over 1 year ago

  • Status changed from Feedback to Resolved

MDoucha, thanks for the quick fix on the LTP side. Public RC validation is smooth thanks to the revert to the previous behavior; there are no asset download failures.
pvorel, thanks for supporting MDoucha.
mkittler, thanks for the update and the fix on the openQA side. We will try it by adding the ASSET_1 variable back once the change is on osd.

I will resolve this poo and create a follow-up.

#11 Updated by pcervinka over 1 year ago

  • Related to action #66007: [kernel] Add ASSET_1 variable back for PowerVM and baremetal jobs added

#12 Updated by pcervinka over 1 year ago

  • Target version changed from 445 to 457

#13 Updated by pcervinka 11 months ago

  • Target version changed from 457 to QE Kernel Done
