action #35299

closed

Potential corrupted file causing installation tests to fail

Added by okurz about 6 years ago. Updated over 5 years ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2018-04-20
Due date:
% Done: 0%
Estimated time:

Description

Observation

from IRC:

[20 Apr 2018 19:43:24] <DimStar> okurz: today I see a correlation between all the DVD/*/await_install failures: they all seem to run on ow4; the same tests rerun on ow4 fail, on ow1 seem to pass
[20 Apr 2018 21:47:31] <okurz> I checked the logs on the worker but could not find anything obvious. The only job that showed some perl errors in the journal is about https://openqa.opensuse.org/tests/659929 which is actually running fine it seems and about to complete.
[20 Apr 2018 22:02:06] <DimStar> okurz: yeah.. looking at the stuff, it might be that the DVD was corrupted transferred to OW4 - it fails on installing some packages.
[20 Apr 2018 22:02:26] <DimStar> e.g. https://openqa.opensuse.org/tests/659950#step/await_install/8
[20 Apr 2018 22:03:01] <okurz> hm, you wouldn't be the first one to suggest to have more checksums to check ;)
[20 Apr 2018 22:05:50] <DimStar> looking at caches, that actually makes sense...
[20 Apr 2018 22:06:04] <DimStar> on o4: 4172888508 Apr 20 14:55 openSUSE-Tumbleweed-DVD-x86_64-Snapshot20180419-Media.iso
[20 Apr 2018 22:06:17] <DimStar> on o1: 4621074432 Apr 20 14:52 openSUSE-Tumbleweed-DVD-x86_64-Snapshot20180419-Media.iso

All the worker logs from openqaw4 from the first mention of "openSUSE-Tumbleweed-DVD-x86_64-Snapshot20180419-Media.iso" are included in the attached logfile. Please check the content.
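
The size mismatch between the two cached copies quoted above is the kind of corruption a check before use could catch. A minimal sketch in Python, assuming the expected size (and optionally a SHA-256 digest) is available from the server; `verify_asset` and its parameters are hypothetical names for illustration and not part of openQA:

```python
import hashlib
import os


def verify_asset(path, expected_size, expected_sha256=None, block_size=1 << 20):
    """Return True if the file matches the expected size and, if given, SHA-256.

    Hypothetical helper for illustration only; not an openQA function.
    """
    # A cheap size comparison catches truncated transfers without reading the file.
    if os.path.getsize(path) != expected_size:
        return False
    if expected_sha256 is None:
        return True
    # Hash the file block by block so large ISO images are not loaded into memory.
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest() == expected_sha256
```

Assuming the copy on ow1 is the complete one, calling `verify_asset(iso_path, 4621074432)` on the 4172888508-byte file seen on ow4 would return `False` at the size check alone.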


Related issues 4 (1 open, 3 closed)

  • Related to openQA Project - action #13646: Ensuring asset files integrity (was: "An error occurred during the installation" on images) (Workable, 2016-09-09)
  • Related to openQA Project - action #35296: Error messages on worker about "Use of uninitialized value $host in hash element at /usr/share/openqa/script/../lib/OpenQA/Worker/Common.pm line 359, <GEN298662> line 4." (Rejected, 2018-04-20)
  • Related to openQA Project - action #34597: Race condition causing problems with the worker cache (Resolved, EDiGiacinto, 2018-05-11)
  • Related to openQA Project - action #34591: Corrupt iso download by cache (Resolved, okurz, 2018-04-09)

Actions #1

Updated by okurz about 6 years ago

As a workaround DimStar deleted the file manually from the cache, let's see what this will do.

EDIT: has worked.

Actions #2

Updated by okurz about 6 years ago

  • Related to action #13646: Ensuring asset files integrity (was: "An error occurred during the installation" on images) added
Actions #3

Updated by okurz about 6 years ago

  • Related to action #35296: Error messages on worker about "Use of uninitialized value $host in hash element at /usr/share/openqa/script/../lib/OpenQA/Worker/Common.pm line 359, <GEN298662> line 4." added
Actions #4

Updated by EDiGiacinto about 6 years ago

What could help here actually is multi-chunked download ( as we did with uploads - it is checking integrity piece by piece, instead of at the end of the process )
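
To illustrate the piece-by-piece idea in the comment above, here is a minimal sketch in Python. It assumes the sender publishes one SHA-256 digest per fixed-size chunk; all names (`ChunkMismatch`, `split_chunks`, `chunk_digests`, `receive_verified`) are hypothetical and do not describe how openQA implements uploads or downloads:

```python
import hashlib


class ChunkMismatch(Exception):
    """Raised when a chunk is corrupted, missing, or unexpected."""


def split_chunks(data, chunk_size):
    """Split bytes into consecutive chunks of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_digests(data, chunk_size):
    """Sender side: compute one SHA-256 hex digest per chunk."""
    return [hashlib.sha256(c).hexdigest() for c in split_chunks(data, chunk_size)]


def receive_verified(chunks, expected_digests):
    """Receiver side: reassemble chunks, failing at the first bad one."""
    received = bytearray()
    count = 0
    for index, chunk in enumerate(chunks):
        if index >= len(expected_digests):
            raise ChunkMismatch(f"unexpected extra chunk {index}")
        # Verify each piece as it arrives instead of only at the end.
        if hashlib.sha256(chunk).hexdigest() != expected_digests[index]:
            raise ChunkMismatch(f"chunk {index} failed verification")
        received.extend(chunk)
        count += 1
    # A transfer that stops early is also an error.
    if count != len(expected_digests):
        raise ChunkMismatch("transfer ended before all chunks arrived")
    return bytes(received)
```

With this shape, a corrupted or truncated transfer is detected at the affected chunk, so only that chunk would need to be fetched again rather than the whole file.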

Actions #5

Updated by coolo almost 6 years ago

  • Priority changed from Normal to High
  • Target version set to Ready
  • Difficulty set to medium
Actions #6

Updated by okurz almost 6 years ago

EDiGiacinto wrote:

What could help here actually is multi-chunked download ( as we did with uploads - it is checking integrity piece by piece, instead of at the end of the process )

I don't understand why we do not trust what is offered by existing tools to handle this properly. Are we using the wrong tools? Is Linux broken? ;)

Actions #7

Updated by EDiGiacinto almost 6 years ago

okurz wrote:

EDiGiacinto wrote:

What could help here actually is multi-chunked download ( as we did with uploads - it is checking integrity piece by piece, instead of at the end of the process )

I don't understand why we do not trust what is offered by existing tools to handle this properly. Are we using the wrong tools? Is Linux broken? ;)

Sorry - what tools are you talking about?

Actions #8

Updated by okurz almost 6 years ago

EDiGiacinto wrote:

Sorry - what tools are you talking about?

rsync, wget, curl, scp, etc.

Actions #9

Updated by szarate almost 6 years ago

  • Target version changed from Ready to Current Sprint
Actions #10

Updated by EDiGiacinto almost 6 years ago

okurz wrote:

EDiGiacinto wrote:

Sorry - what tools are you talking about?

rsync, wget, curl, scp, etc.

Interesting way of seeing things, but i'm sorry we don't share such approach - so what you are proposing exactly? JFYI we had to remove already from the codebase area that are relying on external tools, and i'm afraid you missed the reason why. Do you expect me to script openQA or actually develop it?

Actions #11

Updated by okurz almost 6 years ago

EDiGiacinto wrote:

Interesting way of seeing things, but i'm sorry we don't share such approach - so what you are proposing exactly?

I am not proposing anything but just asking why we can't use existing tools. Reasons might be many-fold, e.g. that these tools (or perl modules) do not exist or they do not apply to our needs for whatever reasons.

JFYI we had to remove already from the codebase area that are relying on external tools, and i'm afraid you missed the reason why

Yes, seems so.

Do you expect me to script openQA or actually develop it?

I don't think this is a real open question so I guess you do not really expect an answer from me.

Actions #12

Updated by EDiGiacinto almost 6 years ago

  • Related to action #34597: Race condition causing problems with the worker cache added
Actions #13

Updated by mkittler almost 6 years ago

  • Assignee set to EDiGiacinto
Actions #14

Updated by szarate almost 6 years ago

Actions #15

Updated by szarate almost 6 years ago

  • Target version changed from Current Sprint to Ready
Actions #16

Updated by szarate almost 6 years ago

  • Target version changed from Ready to Current Sprint

Sending back to product backlog for the time being until we get some feedback if it's still happening.

Actions #17

Updated by szarate almost 6 years ago

  • Target version changed from Current Sprint to Ready

:)

Actions #18

Updated by coolo over 5 years ago

  • Target version changed from Ready to Current Sprint

One more issue related to the worker downloading assets

Actions #19

Updated by coolo over 5 years ago

  • Status changed from New to Resolved
  • Target version changed from Current Sprint to Done