action #35299
closed
Potential corrupted file causing installation tests to fail
Added by okurz over 6 years ago.
Updated about 6 years ago.
Category:
Regressions/Crashes
Description
Observation
from IRC:
[20 Apr 2018 19:43:24] <DimStar> okurz: today I see a correlation between all the DVD/*/await_install failures: they all seem to run on ow4; the same tests rerun on ow4 fail, on ow1 seem to pass
[20 Apr 2018 21:47:31] <okurz> I checked the logs on the worker but could not find anything obvious. The only job that showed some perl errors in the journal is about https://openqa.opensuse.org/tests/659929 which is actually running fine it seems and about to complete.
[20 Apr 2018 22:02:06] <DimStar> okurz: yeah.. looking at the stuff, it might be that the DVD was corrupted transferred to OW4 - it fails on installing some packages.
[20 Apr 2018 22:02:26] <DimStar> e.g. https://openqa.opensuse.org/tests/659950#step/await_install/8
[20 Apr 2018 22:03:01] <okurz> hm, you wouldn't be the first one to suggest to have more checksums to check ;)
[20 Apr 2018 22:05:50] <DimStar> looking at caches, that actually makes sense...
[20 Apr 2018 22:06:04] <DimStar> on o4: 4172888508 Apr 20 14:55 openSUSE-Tumbleweed-DVD-x86_64-Snapshot20180419-Media.iso
[20 Apr 2018 22:06:17] <DimStar> on o1: 4621074432 Apr 20 14:52 openSUSE-Tumbleweed-DVD-x86_64-Snapshot20180419-Media.iso
All the worker logs from openqaw4 from the first mention of "openSUSE-Tumbleweed-DVD-x86_64-Snapshot20180419-Media.iso" are included in the attached logfile. Please check the content.
As a workaround DimStar deleted the file manually from the cache, let's see what this will do.
EDIT: has worked.
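For illustration, the size mismatch seen above (4172888508 bytes on ow4 versus 4621074432 bytes on ow1) and the manual workaround could be sketched roughly as follows. This is a minimal sketch, not openQA code: the function names `size_mismatch` and `evict_if_size_mismatch` are hypothetical, and it assumes the expected size is known from somewhere trustworthy (for example the web UI host). A size check only catches truncation, not corrupted content of the right length.

```python
import os


def size_mismatch(path: str, expected_size: int) -> bool:
    """Return True when the file is missing or its size differs from expected_size."""
    try:
        return os.path.getsize(path) != expected_size
    except FileNotFoundError:
        return True


def evict_if_size_mismatch(path: str, expected_size: int) -> bool:
    """Delete an existing cached file whose size is wrong.

    Returns True if a file was deleted, so the caller knows a fresh
    download is needed. Hypothetical helper, not part of openQA.
    """
    if os.path.exists(path) and size_mismatch(path, expected_size):
        os.remove(path)
        return True
    return False
```

With the numbers from the IRC log, `evict_if_size_mismatch(iso_path, 4621074432)` would have removed the 4172888508-byte copy on ow4, which is what DimStar did by hand.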
- Related to action #13646: Ensuring asset files integrity (was: "An error occurred during the installation" on images) added
- Related to action #35296: Error messages on worker about "Use of uninitialized value $host in hash element at /usr/share/openqa/script/../lib/OpenQA/Worker/Common.pm line 359, <GEN298662> line 4." added
What could actually help here is a multi-chunked download (as we did with uploads): it checks integrity piece by piece instead of only at the end of the process.
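To make the idea concrete, here is a rough sketch of piece-by-piece verification. It is not how openQA implements uploads or downloads. It assumes the sender can provide one SHA-256 digest per chunk ahead of time, and the names `ChunkMismatch` and `verified_chunks` are made up for this example.

```python
import hashlib
from typing import Iterable, Iterator, Sequence


class ChunkMismatch(Exception):
    """Raised when a chunk is corrupted, missing, or unexpected."""


def verified_chunks(
    chunks: Iterable[bytes], expected_digests: Sequence[str]
) -> Iterator[bytes]:
    """Yield each chunk only after its SHA-256 digest has been checked.

    A bad chunk aborts the transfer immediately instead of being
    discovered (or missed) after the whole file has been written.
    """
    received = 0
    for index, chunk in enumerate(chunks):
        if index >= len(expected_digests):
            raise ChunkMismatch(f"unexpected extra chunk {index}")
        actual = hashlib.sha256(chunk).hexdigest()
        if actual != expected_digests[index]:
            raise ChunkMismatch(f"chunk {index} failed its checksum")
        received += 1
        yield chunk
    # A truncated transfer ends early: fewer chunks than digests.
    if received != len(expected_digests):
        raise ChunkMismatch(
            f"expected {len(expected_digests)} chunks, received {received}"
        )
```

A caller would write each yielded chunk to a temporary file and only move it into the cache once the generator finishes without raising. A truncated ISO like the one on ow4 would then hit the final length check rather than end up in the cache.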
- Priority changed from Normal to High
- Target version set to Ready
- Difficulty set to medium
EDiGiacinto wrote:
What could actually help here is a multi-chunked download (as we did with uploads): it checks integrity piece by piece instead of only at the end of the process.
I don't understand why we do not trust what is offered by existing tools to handle this properly. Are we using the wrong tools? Is Linux broken? ;)
okurz wrote:
EDiGiacinto wrote:
What could actually help here is a multi-chunked download (as we did with uploads): it checks integrity piece by piece instead of only at the end of the process.
I don't understand why we do not trust what is offered by existing tools to handle this properly. Are we using the wrong tools? Is Linux broken? ;)
Sorry - what tools are you talking about?
EDiGiacinto wrote:
Sorry - what tools are you talking about?
rsync, wget, curl, scp, etc.
- Target version changed from Ready to Current Sprint
okurz wrote:
EDiGiacinto wrote:
Sorry - what tools are you talking about?
rsync, wget, curl, scp, etc.
Interesting way of seeing things, but I'm sorry, we don't share that approach. So what exactly are you proposing? JFYI, we already had to remove areas of the codebase that relied on external tools, and I'm afraid you missed the reason why. Do you expect me to script openQA or actually develop it?
EDiGiacinto wrote:
Interesting way of seeing things, but I'm sorry, we don't share that approach. So what exactly are you proposing?
I am not proposing anything, just asking why we can't use existing tools. The reasons might be manifold, e.g. that these tools (or Perl modules) do not exist or do not fit our needs for whatever reason.
JFYI, we already had to remove areas of the codebase that relied on external tools, and I'm afraid you missed the reason why.
Yes, seems so.
Do you expect me to script openQA or actually develop it?
I don't think this is a real open question so I guess you do not really expect an answer from me.
- Related to action #34597: Race condition causing problems with the worker cache added
- Assignee set to EDiGiacinto
- Target version changed from Current Sprint to Ready
- Target version changed from Ready to Current Sprint
Sending back to product backlog for the time being until we get some feedback if it's still happening.
- Target version changed from Current Sprint to Ready
- Target version changed from Ready to Current Sprint
One more issue related to the worker downloading assets
- Status changed from New to Resolved
- Target version changed from Current Sprint to Done