action #138515

closed

foobar host up alert size:S

Added by livdywan about 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:

Description

Observation

No data received for pings from worker to central host, likely host is down

Observed 11h15m3s before this notification was delivered, at 2023-10-25 00:50:00 +0200 CEST

Alert from 12.05 CEST
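
As a quick sanity check, the observation time and the delay quoted above can be added up to see whether they match the 12:05 CEST alert. A minimal Python sketch using only the values from the alert text; the variable names are illustrative:

```python
from datetime import datetime, timedelta, timezone

# Values copied from the alert text; CEST is UTC+02:00.
CEST = timezone(timedelta(hours=2), "CEST")
observed_at = datetime(2023, 10, 25, 0, 50, 0, tzinfo=CEST)
delay = timedelta(hours=11, minutes=15, seconds=3)

# Time at which the notification was delivered.
delivered_at = observed_at + delay
print(delivered_at.isoformat())  # 2023-10-25T12:05:03+02:00
```

The result, 12:05:03 CEST, is consistent with the alert time noted above.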

Dashboard: https://stats.openqa-monitor.qa.suse.de/d/WDfoobar?orgId=1
Panel: http://stats.openqa-monitor.qa.suse.de/d/WDfoobar?orgId=1&viewPanel=65105

Acceptance criteria

  • AC1: No alerts about host foobar not being up

Rollback steps

  • Unsilence alert

Suggestions

  • Can we exclude this host/alert using tags?
  • Maybe whatever was being pushed temporarily was removed afterwards?
  • Maybe the hostname is configured wrong on one machine?
  • Check if all machines that should be in the dashboard are in the dashboard?
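
The last suggestion could be sketched roughly as follows, assuming the list of expected machines and the list of hosts shown in the dashboard have already been collected by some other means. The function name and the host names below are hypothetical and only serve as an illustration:

```python
def compare_hosts(expected, seen):
    """Return (missing, unexpected) host names as sorted lists."""
    expected_set, seen_set = set(expected), set(seen)
    missing = sorted(expected_set - seen_set)     # expected but not in the dashboard
    unexpected = sorted(seen_set - expected_set)  # in the dashboard but not expected
    return missing, unexpected


# Hypothetical example data, not taken from the real infrastructure.
expected_hosts = ["worker1", "worker2", "worker3"]
dashboard_hosts = ["worker1", "worker3", "foobar"]

missing, unexpected = compare_hosts(expected_hosts, dashboard_hosts)
print("missing:", missing)
print("unexpected:", unexpected)
```

An unexpected entry such as foobar would point at a wrongly configured hostname or leftover test data.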
Actions #1

Updated by tinita about 1 year ago

What kind of alert is this? I can't find anything in my inbox mentioning foobar in the subject...

Actions #2

Updated by livdywan about 1 year ago

  • Description updated (diff)

tinita wrote in #note-1:

What kind of alert is this? I can't find anything in my inbox mentioning foobar in the subject...

Correct. The subject would've been [FIRING:3, RESOLVED:2] host_up (openQA worker) because of alert grouping.

I'm adding the (broken) links from the email.

Actions #3

Updated by livdywan about 1 year ago

  • Subject changed from foobar host up alert to foobar host up alert size:S
  • Description updated (diff)
  • Status changed from New to Workable
Actions #4

Updated by jbaier_cz about 1 year ago

  • Assignee set to jbaier_cz
Actions #5

Updated by jbaier_cz about 1 year ago

  • Status changed from Workable to Resolved

I wasn't able to find any related data in InfluxDB or in salt. I assume it was just a test or a mistake and all data was cleaned up. The alert has no longer been silenced since yesterday and has not fired again. All seems fine.
