action #107875

closed

[alert][osd] Apache Response Time alert size:M

Added by mkittler almost 3 years ago. Updated 5 months ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Start date: 2022-03-04
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Observation

We've got the alert again on March 3, 2022 09:00:40:

[Alerting] Apache Response Time alert
The apache response time exceeded the alert threshold.
* Check the load of the web UI host
* Consider restarting the openQA web UI service and/or apache

Also see https://progress.opensuse.org/issues/73633

Metric name    Value
Min            18733128.83

Relevant panel: https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?tab=alert&viewPanel=84


Tina wrote in chat:

if anyone was wondering about the short high load on osd, I fetched /api/v1/jobs and it took 10 minutes

But that was already on Wednesday, so it shouldn't have caused this.

Further data points
- High CPU likely didn't affect scheduling, or we should've had other reports of it
- High CPU wouldn't cause a spike in failures in jobs?

Suggestions

  • The apache log parsing seems to be quite heavy. Can we reduce the amount of data parsed by telegraf?
  • Reduce the frequency at which telegraf takes new data points
  • Extend alerting measurement period from 5m to 30m (or higher) to smooth out gaps
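
To illustrate the last suggestion, here is a minimal sketch of how a longer evaluation window could dampen a short spike and tolerate missing data points. It is not the actual alert rule: the mean reducer, the sample values, the threshold, and the function names `windowed_mean` and `should_alert` are all assumptions made up for this example.

```python
from statistics import mean

def windowed_mean(samples, window):
    """Mean of the non-missing values among the last `window` samples.

    Missing data points (gaps) are represented as None and skipped.
    Returns None if the window contains no data at all.
    """
    recent = [s for s in samples[-window:] if s is not None]
    return mean(recent) if recent else None

def should_alert(samples, window, threshold):
    """True if the windowed mean exists and exceeds the threshold."""
    value = windowed_mean(samples, window)
    return value is not None and value > threshold

# Hypothetical per-minute response times (arbitrary units):
# 25 quiet minutes followed by a 5-minute spike.
samples = [100.0] * 25 + [1000.0] * 5
THRESHOLD = 500.0  # hypothetical alert threshold

print(should_alert(samples, 5, THRESHOLD))   # True: the 5m window sees only the spike
print(should_alert(samples, 30, THRESHOLD))  # False: the 30m mean is 250.0

# A gap of five missing points leaves the 5m window empty,
# while the 30m window still has data to evaluate.
gappy = [100.0] * 25 + [None] * 5
print(windowed_mean(gappy, 5))    # None
print(windowed_mean(gappy, 30))   # 100.0
```

The trade-off is that a longer window also delays detection of a real, sustained slowdown, so the period would need to be weighed against how quickly the alert should fire.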


Related issues 5 (0 open, 5 closed)

Related to openQA Infrastructure (public) - action #107257: [alert][osd] Apache Response Time alert size:M | Resolved | okurz | 2022-02-22
Related to openQA Infrastructure (public) - action #96807: Web UI is slow and Apache Response Time alert got triggered | Resolved | okurz | 2021-08-12 | 2021-10-01
Related to openQA Project (public) - action #94111: Optimize /api/v1/jobs | Resolved | tinita | 2021-06-16
Related to openQA Infrastructure (public) - action #128789: [alert] Apache Response Time alert size:M | Resolved | nicksinger | 2023-04-01
Copied to openQA Project (public) - coordination #108209: [epic] Reduce load on OSD | Resolved | okurz | 2023-04-01
