coordination #62456
[epic] test incompletes after failing in GRU download task on "Inactivity timeout" with no logs
100%
Description
Observation
The openQA test in scenario obs-Unstable-Appliance-x86_64-obs_appliance@64bit-4G ends up incomplete after a GRU download task fails with:
Gru job failed Reason: asset download: download of http://download.opensuse.org/repositories/OBS:/Server:/Unstable/images/obs-server.x86_64-2.10.51-qcow2-Build2.438.qcow2 to /var/lib/openqa/share/factory/hdd/obs-server.x86_64-2.10.51-qcow2-Build2.438.qcow2 failed: connection error: Inactivity timeout at /usr/share/openqa/script/../lib/OpenQA/Task/Asset/Download.pm line 74.
This ends the whole job as "incomplete".
Reproducible
Hard to reproduce; it seems to be related to temporary network problems.
Acceptance criteria
- AC1: GRU download retries automatically on temporary network problems
- AC2: User feedback for GRU download failures is comparable to the feedback for other incompletes with known reasons
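As an illustration of what AC1 asks for, here is a minimal sketch of retrying a download on transient failures with exponential backoff. It is written in Python for brevity and is not the openQA implementation (which lives in Perl). The names `TransientDownloadError`, `download_with_retry`, `fetch`, `attempts` and `base_delay` are all hypothetical.

```python
import time


class TransientDownloadError(Exception):
    """Hypothetical marker for errors worth retrying, e.g. an inactivity timeout."""


def download_with_retry(fetch, url, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url) up to `attempts` times, backing off after transient failures."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except TransientDownloadError:
            if attempt == attempts:
                raise  # out of retries: surface the last error to the caller
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Only errors classified as transient are retried here; any other exception propagates immediately, so a permanent failure such as a missing asset would not be retried in vain. Passing `sleep` as a parameter keeps the backoff testable without real waiting.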
Open points
Together with mkittler, I already noticed on 2020-01-21 that user feedback for GRU download jobs is handled quite differently from other incompletes. Retry is just one part of it. Maybe the way GRU download jobs provide details while still ending as incomplete is the way to go?
Further details
Always latest result in this scenario: latest
Subtasks
History
#1
Updated by kraih about 1 year ago
Might be interesting to combine the cache service and gru download code into a shared module, and get retry support that way. Both do pretty much the same thing. It's a bit of upfront work, but later on we'd benefit from being able to test the various special cases for downloads in one place. And bugfixes would automatically apply to both.
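A rough sketch of what such a shared module could look like, in Python purely for illustration. The class and function names (`SharedDownloader`, `DownloadResult`, `gru_download_task`, `cache_service_request`) are hypothetical and do not correspond to existing openQA code. It also assumes that connection errors surface as `OSError`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DownloadResult:
    ok: bool
    attempts: int
    error: Optional[str] = None


class SharedDownloader:
    """Hypothetical single home for download logic, including retries."""

    def __init__(self, fetch: Callable[[str, str], None], attempts: int = 3):
        self.fetch = fetch  # injected so tests can simulate network failures
        self.attempts = attempts

    def download(self, url: str, target: str) -> DownloadResult:
        error = None
        for attempt in range(1, self.attempts + 1):
            try:
                self.fetch(url, target)
                return DownloadResult(ok=True, attempts=attempt)
            except OSError as exc:  # assumption: connection errors are OSError
                error = str(exc)
        return DownloadResult(ok=False, attempts=self.attempts, error=error)


def gru_download_task(downloader: SharedDownloader, url: str, target: str) -> str:
    """Thin caller on the GRU side: turn the result into a job reason."""
    result = downloader.download(url, target)
    return "done" if result.ok else f"asset download: {result.error}"


def cache_service_request(downloader: SharedDownloader, url: str, target: str) -> bool:
    """Thin caller on the cache service side: only needs success or failure."""
    return downloader.download(url, target).ok
```

With both callers reduced to thin wrappers, special cases such as timeouts or partial downloads would only need to be tested once, against `SharedDownloader.download`.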
#2
Updated by okurz about 1 year ago
- Description updated (diff)
OK, so I see AC1 as covered and solved in #62459; the UX part is still open.
#6
Updated by szarate 6 months ago
For the reason for the tracker change, see: http://mailman.suse.de/mailman/private/qa-sle/2020-October/002722.html