action #65627

closed

[kernel] Job fails to download LTP asset file

Added by pcervinka over 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: QE Kernel - QE Kernel Done
Start date: 2020-04-15
Due date:
% Done: 0%
Estimated time:
Description

Jobs started failing with an asset file download error after recent changes to LTP asset handling.

Related job: https://openqa.suse.de/tests/4124669

Likely error from autoinst-log.txt:
[2020-04-15T14:35:55.0612 CEST] [info] [pid:143562] +++ setup notes +++
[2020-04-15T14:35:55.0612 CEST] [info] [pid:143562] Running on grenache-1:5 (Linux 4.12.14-lp151.28.40-default #1 SMP Fri Mar 6 13:48:15 UTC 2020 (f0f1262) ppc64le)
[2020-04-15T14:35:55.0617 CEST] [debug] [pid:143562] Found ASSET_1, caching runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz
[2020-04-15T14:35:55.0619 CEST] [info] [pid:143562] Downloading runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz, request #82339 sent to Cache Service
[2020-04-15T14:36:01.0241 CEST] [info] [pid:143562] Download of runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz processed:
[info] [#82339] Cache size of "/var/lib/openqa/cache" is 49GiB, with limit 50GiB
[info] [#82339] Downloading "runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz" from "http://openqa.suse.de/tests/4124669/asset/other/runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz"
[info] [#82339] Download of "/var/lib/openqa/cache/openqa.suse.de/runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz" failed: 404 Not Found

[2020-04-15T14:36:01.0252 CEST] [error] [pid:143562] Failed to download runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz to /var/lib/openqa/cache/openqa.suse.de/runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz
[2020-04-15T14:36:01.0742 CEST] [info] [pid:143562] +++ worker notes +++
[2020-04-15T14:36:01.0742 CEST] [info] [pid:143562] End time: 2020-04-15 12:36:01
[2020-04-15T14:36:01.0742 CEST] [info] [pid:143562] Result: setup failure
[2020-04-15T14:36:01.0746 CEST] [info] [pid:187969] Uploading autoinst-log.txt

Related issues 1 (0 open, 1 closed)

Related to openQA Tests - action #66007: [kernel] Add ASSET_1 variable back for PowerVM and baremetal jobs (Rejected, 2020-04-23)

Actions #1

Updated by pcervinka over 4 years ago

  • Description updated (diff)
Actions #2

Updated by MDoucha over 4 years ago

This appears to be a bug in asset registration. The download URL used by the cache service above always returns a 404 error:
http://openqa.suse.de/tests/4124669/asset/other/runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz

However, if you replace the job ID in the URL with the ID of the parent job which produced the tarball, it'll work just fine:
http://openqa.suse.de/tests/4124589/asset/other/runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz

What may be relevant to the cause of the bug:

  • The jobs are in different job groups
  • The jobs are linked together via START_DIRECTLY_AFTER_TEST (not the usual START_AFTER_TEST)
  • The tarball is a public asset, it should be accessible from both job groups
  • Restarting the failed job will result in successful asset registration and download, see https://openqa.suse.de/tests/4124669#next_previous
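
To make the difference between the two URLs above concrete, here is a minimal sketch that builds both of them from a job ID. The helper name `asset_url` and its parameters are made up for illustration and are not part of openQA; the URL layout is simply copied from the two links in this comment.

```python
# Hypothetical helper: rebuilds the per-job asset URL layout seen in this ticket.
# It is NOT an openQA API, only string formatting based on the URLs quoted above.

HOST = "openqa.suse.de"
ASSET_NAME = "runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz"

FAILED_JOB_ID = 4124669  # job whose asset download returned 404
PARENT_JOB_ID = 4124589  # parent job which produced the tarball


def asset_url(host: str, job_id: int, asset_type: str, name: str) -> str:
    """Return the asset URL for a given job, following the layout seen in the log."""
    return f"http://{host}/tests/{job_id}/asset/{asset_type}/{name}"


if __name__ == "__main__":
    # URL requested by the cache service (404 according to the log)
    print(asset_url(HOST, FAILED_JOB_ID, "other", ASSET_NAME))
    # URL that works according to this comment
    print(asset_url(HOST, PARENT_JOB_ID, "other", ASSET_NAME))
```

The two results differ only in the job ID, which is consistent with the asset being registered for the parent job but not for the failed job.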
Actions #3

Updated by pvorel over 4 years ago

Different job group shouldn't be an issue (we have LTP jobs in other groups, which are working).

Actions #4

Updated by pvorel over 4 years ago

The same problem occurs in LTP baremetal tests, which suggests that there is some problem with START_DIRECTLY_AFTER_TEST:
https://openqa.suse.de/tests/4125856#dependencies

Actions #5

Updated by pvorel over 4 years ago

  • Project changed from 46 to openQA Project
Actions #6

Updated by pcervinka over 4 years ago

  • Status changed from New to Feedback
  • Assignee set to MDoucha
  • Target version set to 445

Martin's PR was merged: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/10047/files.
Let's see the result of the next build (or a restart of the old one).

Actions #7

Updated by mkittler over 4 years ago

I know I've mentioned in the chat that this is likely not a problem with START_DIRECTLY_AFTER_TEST because the execution sequence of the tests is correct. However, I recently investigated a different problem regarding assets, and it came to my attention that the asset registration actually takes chained dependencies into account. Looking at the code, it seems that the same logic used for regular START_AFTER_TEST dependencies has to be applied to START_DIRECTLY_AFTER_TEST dependencies as well. I'll create a fix.
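
The idea can be illustrated with a small sketch. This is not openQA's code (openQA is written in Perl), and the `Job` class and `find_asset_owner` function are hypothetical names invented here. It only models the assumption that an asset lookup walks parent jobs along a set of dependency types, so the asset is found only if the type linking the jobs is in that set.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

# Dependency type names as used in the job settings mentioned in this ticket
CHAINED = "START_AFTER_TEST"
DIRECTLY_CHAINED = "START_DIRECTLY_AFTER_TEST"


@dataclass
class Job:
    """Hypothetical, simplified job model (not openQA's schema)."""
    job_id: int
    produced_assets: Set[str] = field(default_factory=set)
    # Each entry is (dependency type, parent job)
    parents: List[Tuple[str, "Job"]] = field(default_factory=list)


def find_asset_owner(job: Job, asset: str, followed_types: Set[str]) -> Optional[int]:
    """Return the ID of the job or ancestor that produced `asset`.

    Only parent links whose dependency type is in `followed_types` are walked.
    Returns None if no reachable job produced the asset.
    """
    stack = [job]
    seen: Set[int] = set()
    while stack:
        current = stack.pop()
        if current.job_id in seen:
            continue
        seen.add(current.job_id)
        if asset in current.produced_assets:
            return current.job_id
        for dep_type, parent in current.parents:
            if dep_type in followed_types:
                stack.append(parent)
    return None


# Job IDs and asset name taken from this ticket; the structure is an assumption.
ASSET = "runtest-files-sle-15-SP2-ppc64le-178.1-Online@ppc64le-spvm.tar.gz"
parent = Job(4124589, produced_assets={ASSET})
child = Job(4124669, parents=[(DIRECTLY_CHAINED, parent)])

# Following only regular chained dependencies misses the parent's asset.
only_chained = find_asset_owner(child, ASSET, {CHAINED})
# Treating directly chained dependencies the same way finds it.
both_types = find_asset_owner(child, ASSET, {CHAINED, DIRECTLY_CHAINED})
```

In this model, `only_chained` is `None` while `both_types` is the parent job ID, which mirrors the behavior described above: the lookup has to follow START_DIRECTLY_AFTER_TEST links in the same way as START_AFTER_TEST links.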

Actions #9

Updated by pcervinka over 4 years ago

  • Status changed from Feedback to Resolved

@MDoucha thanks for the quick fix on the LTP side. Public RC validation is going smoothly thanks to the revert to the previous behavior; there are no asset download failures.
@pvorel thanks for supporting mdoucha.
@mkittler thanks for the update and the fix on the openQA side. We will try it by adding the ASSET_1 variable back once the change is on osd.

I will resolve this poo and create a follow-up.

Actions #11

Updated by pcervinka over 4 years ago

  • Related to action #66007: [kernel] Add ASSET_1 variable back for PowerVM and baremetal jobs added
Actions #12

Updated by pcervinka over 4 years ago

  • Target version changed from 445 to 457
Actions #13

Updated by pcervinka about 4 years ago

  • Target version changed from 457 to QE Kernel Done