action #163928

closed

[alert] Openqa HTTP Response lost on 15-07-24 size:S

Added by ybonatakis 12 days ago. Updated 3 days ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2024-07-15
Due date:
% Done: 0%
Estimated time:
Description

Observation

https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?orgId=1&viewPanel=78&from=1720996087861&to=1720999161474

I looked at the logs, which are attached to this ticket.

I can't spot the actual problem. The system seems to have performed an update and recovered after the services were restarted.
The unresponsiveness lasted from 00:42 to 01:05 (>20 min).

Looking at the logs, I see some errors from telegraf:

openqa telegraf[6820]: 2024-07-14T22:54:50Z E! [inputs.http] Error in plugin: [url=https://openqa.suse.de/admin/*]: Get "https://openqa.suse.de/admin/*": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

and many entries like:

Jul 15 00:54:29 openqa openqa[12024]: [debug] [pid:12024] _carry_over_candidate(14928963): ignoring job 14855612 with repeated problem
Jul 15 00:54:29 openqa openqa[12024]: [debug] [pid:12024] _carry_over_candidate(14928963): checking take over from 14834954: _failure_reason=GOOD
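
To narrow down when the web UI stopped answering, one option is to pull the telegraf `inputs.http` errors out of the log and look at the first and last timestamps. The sketch below is only an illustration: the regular expression is derived from the single telegraf line quoted above, the helper name `telegraf_error_window` is made up for this example, and it assumes the attached log uses the same line layout. Note that the telegraf timestamp ends in `Z` (UTC), while the openQA debug lines appear to be in local time, so the two presumably differ by the local UTC offset.

```python
import re
from datetime import datetime, timezone

# Pattern derived from the telegraf line quoted in this ticket (assumption:
# other telegraf inputs.http errors in the log follow the same layout).
TELEGRAF_ERROR = re.compile(
    r"telegraf\[\d+\]: (?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z E! "
    r"\[inputs\.http\] Error in plugin: \[url=(?P<url>[^\]]+)\]"
)


def telegraf_error_window(lines):
    """Return (first, last, count) for telegraf inputs.http errors, in UTC.

    Returns None when no matching line is found.
    """
    stamps = []
    for line in lines:
        match = TELEGRAF_ERROR.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S")
            stamps.append(ts.replace(tzinfo=timezone.utc))
    if not stamps:
        return None
    return min(stamps), max(stamps), len(stamps)


# The two sample lines are copied from this ticket; only the first is a
# telegraf error, so only that one is counted.
sample = [
    'openqa telegraf[6820]: 2024-07-14T22:54:50Z E! [inputs.http] Error in plugin: '
    '[url=https://openqa.suse.de/admin/*]: Get "https://openqa.suse.de/admin/*": '
    'context deadline exceeded (Client.Timeout exceeded while awaiting headers)',
    'Jul 15 00:54:29 openqa openqa[12024]: [debug] [pid:12024] '
    '_carry_over_candidate(14928963): ignoring job 14855612 with repeated problem',
]

window = telegraf_error_window(sample)
if window is not None:
    first, last, count = window
    print(f"{count} telegraf error(s) between {first.isoformat()} and {last.isoformat()}")
```

The same function could be fed the attached log file, for example by passing an open file handle instead of `sample`, to check whether the error window matches the 00:42 to 01:05 period seen in the dashboard.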

Files

alert_tm0h5mf4k_full (3.63 MB), "alert_tm0h5mf4k_full truncated due to the upload size limit", ybonatakis, 2024-07-15 08:44

Related issues 2 (0 open, 2 closed)

Related to openQA Infrastructure - action #163775: Conduct "lessons learned" with Five Why analysis about many alerts, e.g. alerts not silenced for known issues size:S (Resolved, livdywan, 2024-07-10)

Is duplicate of openQA Infrastructure - action #163592: [alert] (HTTP Response alert Salt tm0h5mf4k) size:M (Resolved, okurz, 2024-07-10)
