action #135260

closed

zabbix - o3 High CPU utilization (over 90% for 5m) size:M

Added by tinita about 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 2023-09-06
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Observation

Problem started at 12:52:09 on 2023.09.06
Problem name: High CPU utilization (over 90% for 5m)
Host: ariel.dmz-prg2.suse.org (over old-ariel)
Severity: Warning
Operational data: Current utilization: 93.12 %
Original problem ID: 553533862

https://zabbix.suse.de/history.php?action=showgraph&itemids%5B%5D=341946

Acceptance criteria

  • AC1: It has been determined whether this is a problem or not.

Suggestions

  • Investigate what caused the brief spike in CPU usage
  • Consider bumping the thresholds - maybe this is fine?
  • See if it's possible to add processes of interest to the alert details
  • Request more resources for the VM
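
To help judge whether the thresholds need bumping, exported CPU samples could be checked for sustained breaches with something like the sketch below. It is only an approximation of what "over 90% for 5m" means: the actual Zabbix trigger expression is not shown in this ticket, and the function name, the sample format (unix timestamp, utilization in percent) and the defaults are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical sample format: (unix timestamp in seconds, CPU utilization in percent)
Sample = Tuple[float, float]


def sustained_breaches(
    samples: Iterable[Sample],
    threshold: float = 90.0,  # assumed from the alert name "over 90%"
    window: float = 300.0,    # assumed from the alert name "for 5m"
) -> List[Tuple[float, float]]:
    """Return (start, end) spans where utilization stayed above
    `threshold` for at least `window` seconds without interruption."""
    spans: List[Tuple[float, float]] = []
    start = None  # timestamp of the first sample of the current breach
    last = None   # timestamp of the latest sample of the current breach
    for ts, util in sorted(samples):
        if util > threshold:
            if start is None:
                start = ts
            last = ts
        else:
            # The breach ended; keep it only if it lasted long enough.
            if start is not None and last - start >= window:
                spans.append((start, last))
            start = None
    # Handle a breach that is still ongoing at the end of the data.
    if start is not None and last - start >= window:
        spans.append((start, last))
    return spans
```

Running this over historical data with a few candidate thresholds (for example 90, 95) would show how often the alert would have fired and whether the spike on 2023-09-06 was an outlier.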

Files

o3-munin-nginx-requests.png (125 KB), tinita, 2023-09-06 12:58
or-munin-cpu.png (98.8 KB), tinita, 2023-09-06 13:06
or-munin-load.png (110 KB), tinita, 2023-09-06 13:06
Actions #1

Updated by tinita about 1 year ago

  • Description updated (diff)
Actions #2

Updated by tinita about 1 year ago

We had quite a lot of nginx requests according to munin.

Actions #3

Updated by tinita about 1 year ago

The load itself wasn't that high; it was actually much higher yesterday.
The CPU graph shows high values for "user" and "nice", which is atypical compared to the historical data.

Actions #4

Updated by okurz about 1 year ago

  • Target version set to Ready
Actions #5

Updated by livdywan about 1 year ago

  • Subject changed from zabbix - o3 High CPU utilization (over 90% for 5m) to zabbix - o3 High CPU utilization (over 90% for 5m) size:M
  • Description updated (diff)
  • Status changed from New to Workable
Actions #6

Updated by okurz about 1 year ago

  • Target version changed from Ready to Tools - Next
Actions #7

Updated by okurz about 1 year ago

  • Status changed from Workable to Resolved
  • Assignee set to okurz
  • Target version changed from Tools - Next to Ready

We never disabled the alert and the problem did not reappear, so it is not reproducible. By now we likely do not even have logs or data to check further, so I am just assuming this is gone for good.
