action #174316

closed

[o3][zabbix][alert] no email about zabbix alerts including storage and cpu load size:S

Added by okurz 3 months ago. Updated about 2 months ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Start date: 2024-12-12
Due date:
% Done: 0%
Estimated time:

Description

Observation

From https://zabbix.nue.suse.com/zabbix.php?show=1&name=&inventory%5B0%5D%5Bfield%5D=type&inventory%5B0%5D%5Bvalue%5D=&evaltype=0&tags%5B0%5D%5Btag%5D=&tags%5B0%5D%5Boperator%5D=0&tags%5B0%5D%5Bvalue%5D=&show_tags=3&tag_name_format=0&tag_priority=&show_opdata=0&show_timeline=1&filter_name=&filter_show_counter=0&filter_custom_time=0&sort=clock&sortorder=DESC&age_state=0&show_suppressed=0&unacknowledged=0&compact_view=0&details=0&highlight_row=0&action=problem.view

it shows

2024-12-11 06:50:26 | Warning | PROBLEM | ariel.dmz-prg2.suse.org | /var/tmp: Disk space is low and might be full in 7d (used > 85%) | 1d 9h 40m | No | Application: Filesystem /var/tmp
2024-12-11 06:50:23 | Warning | PROBLEM | ariel.dmz-prg2.suse.org | /: Disk space is low and might be full in 7d (used > 85%) | 1d 9h 40m | No | Application: Filesystem /

which should be handled in #174313, but I don't recall that we received any alert notification, e.g. an email to o3-admins@suse.de. Ensure we get emails on alerts.

Suggestions

  • Look into zabbix configuration options
  • Check if we would only get emails for critical, not "warning"
  • Crosscheck if this is a regression or if we never got emails
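
The "critical only, not warning" check could be sketched roughly as follows. This is only an illustration, not the configuration actually in use: it assumes the action definitions are fetched via the Zabbix JSON-RPC method `action.get` with `selectFilter`, and that condition type 4 means "trigger severity" with the operator codes listed below. These codes should be verified against the API documentation of the deployed Zabbix version. The helper names and the example action are hypothetical, and no network call is made.

```python
import json

# Zabbix trigger severities as numeric codes (0 = not classified ... 5 = disaster).
SEVERITIES = {
    "not classified": 0,
    "information": 1,
    "warning": 2,
    "average": 3,
    "high": 4,
    "disaster": 5,
}

# Assumption: condition type 4 is "trigger severity" in trigger-action filters.
CONDITION_TRIGGER_SEVERITY = 4

# Assumption: operator codes used for severity conditions.
OPERATORS = {
    0: lambda severity, value: severity == value,  # equals
    1: lambda severity, value: severity != value,  # does not equal
    5: lambda severity, value: severity >= value,  # is greater than or equals
    6: lambda severity, value: severity <= value,  # is less than or equals
}


def build_action_get_request(request_id=1):
    """Build a JSON-RPC payload listing trigger actions with their filters."""
    return {
        "jsonrpc": "2.0",
        "method": "action.get",
        "params": {
            "output": ["actionid", "name", "status"],
            "selectFilter": "extend",
            "selectOperations": "extend",
            "filter": {"eventsource": 0},  # 0 = events created by triggers
        },
        "id": request_id,
    }


def would_notify(action, severity_name):
    """Return True if the action's severity conditions admit the given severity.

    Simplification: only severity conditions are looked at, and several of
    them are treated as alternatives (OR). An action without any severity
    condition is considered to match every severity.
    """
    severity = SEVERITIES[severity_name]
    conditions = [
        c for c in action.get("filter", {}).get("conditions", [])
        if int(c["conditiontype"]) == CONDITION_TRIGGER_SEVERITY
    ]
    if not conditions:
        return True
    return any(
        OPERATORS[int(c["operator"])](severity, int(c["value"]))
        for c in conditions
    )


# Hypothetical action as it might come back from action.get (values are strings).
example_action = {
    "actionid": "3",
    "name": "Report problems to o3 admins",
    "status": "0",
    "filter": {
        "evaltype": "0",
        "conditions": [{"conditiontype": "4", "operator": "5", "value": "4"}],
    },
}

print(json.dumps(build_action_get_request()["params"]["filter"]))
print(would_notify(example_action, "warning"))
print(would_notify(example_action, "high"))
```

With the hypothetical action above (severity "is greater than or equals" high), a "warning" problem such as the disk space one would not trigger a notification, while a "high" problem would. Other filter conditions (host groups, tags, maintenance) and the action's operations are deliberately ignored here and would need to be checked separately.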

Related issues 4 (0 open, 4 closed)

Related to openQA Infrastructure (public) - action #40196: [monitoring] monitor internal port 9526, port 80, external port 443 accessibility of o3 and response times size:M | Resolved | okurz | 2018-08-23
Related to openQA Infrastructure (public) - action #174916: [alert][zabbix@suse.de] Problem: Load average is too high (per CPU load over 4 for 5m) size: S | Resolved | gpuliti | 2024-12-31 | 2025-01-25
Related to openQA Infrastructure (public) - action #175210: [o3][zabbix] reconsider e-mail notification settings size:S | Resolved | robert.richardson | 2024-12-12
Copied from openQA Infrastructure (public) - action #174313: [o3][zabbix][alert] / and /var/tmp: "Disk space is low and might be full in 7d (used > 85%)" since 2024-12-11 06:50 size:S | Resolved | mkittler