action #164222

open

Client error messages look like http response codes - we should phrase them better - was: asset download fails with "521 Connect timeout"

Added by okurz 8 days ago. Updated 5 days ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Regressions/Crashes
Target version:
Start date:
2024-07-19
Due date:
2024-08-03 (Due in 7 days)
% Done:

0%

Estimated time:

Description

Observation

https://openqa.suse.de/tests/14969481#step/GRU/1 shows

preparation failed: Downloading "http://automotive-3.arch.suse.de/gitlab/chuller/carwos-chuller_carwos1.1-2850918.qcow2" failed with: Download of "/var/lib/openqa/share/factory/hdd/carwos-chuller_carwos1.1-2850918.qcow2" failed: 521 Connect timeout 

and so do a lot of other jobs in the same job group. Apparently this happened because chuller wants to "resurrect" the carwos tests and is struggling to get the communication up and running again. That can happen, but we should make sure that error messages can be read correctly. Here the "521 Connect timeout" looks like an HTTP response from "the server" but really seems to be an error generated by the client. Maybe Mojo requests produce errors that look similar to HTTP response codes. We should ensure these are properly labeled so that people cannot confuse them with responses from the server.

Suggestions

  • Find the code responsible for downloading this (is it test code? Minion code? Something else?) and improve its log message
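A minimal sketch of what such labeling could look like, written in Python for illustration only and not taken from openQA or Mojolicious (the responsible code has not been located yet). The `DownloadError` record and the `describe_download_error` helper are hypothetical names, and the assumption that the caller can tell whether an HTTP response was actually received is not confirmed by this ticket.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DownloadError:
    """Hypothetical error record; not an openQA or Mojolicious type."""
    message: str
    code: Optional[int] = None   # numeric code, if one was reported
    from_server: bool = False    # assumed: True only if an HTTP response was received


def describe_download_error(url: str, err: DownloadError) -> str:
    """Build a log message that states where the error originated."""
    if err.from_server and err.code is not None:
        # A real HTTP response arrived, so the code is an HTTP status.
        return (f'Download of "{url}" failed: '
                f'server responded with HTTP status {err.code} ({err.message})')
    # No response arrived; avoid presenting the code as an HTTP status.
    detail = err.message
    if err.code is not None:
        detail = f"{err.message} (internal code {err.code})"
    return (f'Download of "{url}" failed: '
            f'client-side error, no HTTP response received: {detail}')
```

With this sketch, an error like the one observed here would be logged as a client-side error with "Connect timeout (internal code 521)" instead of a bare "521 Connect timeout", so it can no longer be mistaken for a server response.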

Rollback steps

Actions #1

Updated by livdywan 8 days ago

Might be related to #163766? Although this is 521 Connect timeout versus Download error 598.

Actions #2

Updated by nicksinger 8 days ago

  • Status changed from New to In Progress
  • Assignee set to nicksinger
Actions #3

Updated by livdywan 8 days ago

It seems like this job has never worked? https://openqa.suse.de/tests/14960353#step/GRU/1 is the oldest one I see in Next/Previous and it has the same issue 🤔

Actions #4

Updated by nicksinger 8 days ago

  • Assignee deleted (nicksinger)
  • Priority changed from Urgent to Normal

Asked chuller about the new scheduled jobs: https://suse.slack.com/archives/C02CANHLANP/p1721384884116119 - it is expected that they are running. Apparently also just scheduling a single job is not possible.
Created a silence for now to mitigate the priority: https://monitor.qa.suse.de/alerting/silence/1356ce3d-d7ca-4750-b478-710cb0366d85/edit?alertmanager=grafana

This might turn out to be an openQA feature request, because I'm not certain whether a job that fails to download necessary assets (not during preparation but as a test module) should end up incomplete.
I'm unassigning myself now because I don't know how I could continue.

Actions #5

Updated by nicksinger 8 days ago

  • Description updated (diff)
  • Assignee set to nicksinger
Actions #6

Updated by openqa_review 8 days ago

  • Due date set to 2024-08-03

Setting due date based on mean cycle time of SUSE QE Tools

Actions #7

Updated by nicksinger 5 days ago

  • Subject changed from asset download fails with "521 Connect timeout" to Client error messages look like http response codes - we should phrase them better - was: asset download fails with "521 Connect timeout"
  • Description updated (diff)
  • Status changed from In Progress to New
  • Assignee deleted (nicksinger)