action #138515
closed
foobar host up alert size:S
Description
Observation
No data received for pings from worker to central host, likely host is down
Observed 11h15m3s before this notification was delivered, at 2023-10-25 00:50:00 +0200 CEST
Alert from 12:05 CEST
Dashboard: https://stats.openqa-monitor.qa.suse.de/d/WDfoobar?orgId=1
Panel: http://stats.openqa-monitor.qa.suse.de/d/WDfoobar?orgId=1&viewPanel=65105
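The timestamps above can be cross-checked with a little arithmetic. This is a minimal sketch, assuming that 00:50:00 is the time of the observation and that 11h15m3s is the delay until the notification was delivered; the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# CEST is UTC+2; a fixed offset is sufficient for this single timestamp.
CEST = timezone(timedelta(hours=2), name="CEST")

# Assumption: this is when the missing ping data was first observed.
observed_at = datetime(2023, 10, 25, 0, 50, 0, tzinfo=CEST)

# Delay between the observation and the delivery of the notification.
delay = timedelta(hours=11, minutes=15, seconds=3)

delivered_at = observed_at + delay
print(delivered_at.isoformat())  # 2023-10-25T12:05:03+02:00
```

Under that reading, the notification was delivered at about 12:05 CEST, which agrees with the alert time given above.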
Acceptance criteria
- AC1: No alerts about host foobar not being up
Rollback steps
- Unsilence alert
Suggestions
- Can we exclude this host/alert using tags?
- Maybe whatever was being pushed temporarily was removed afterwards?
- Maybe hostname configured wrong on one machine?
- Check if all machines that should be in the dashboard are in the dashboard?
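The last suggestion comes down to comparing two host lists. The following is a rough sketch only: the host names and both lists are hypothetical placeholders, and how the lists would actually be obtained (for example from salt and from the dashboard's data source) is not covered here.

```python
def compare_hosts(expected_hosts, dashboard_hosts):
    """Return (missing, unexpected) host names as sorted lists.

    missing:    expected hosts that do not appear in the dashboard
    unexpected: hosts in the dashboard that are not expected
    """
    expected = set(expected_hosts)
    seen = set(dashboard_hosts)
    return sorted(expected - seen), sorted(seen - expected)


# Hypothetical example data, not taken from the real infrastructure.
expected_hosts = ["worker1", "worker2", "worker3"]
dashboard_hosts = ["worker1", "worker2", "foobar"]

missing, unexpected = compare_hosts(expected_hosts, dashboard_hosts)
print("missing from dashboard:", missing)
print("unexpected in dashboard:", unexpected)
```

An entry such as foobar showing up only in the "unexpected" list would fit the theory that a host name was misconfigured or that test data was pushed temporarily.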
Updated by tinita about 1 year ago
What kind of alert is this? I can't find anything in my inbox mentioning foobar in the subject...
Updated by livdywan about 1 year ago
- Description updated (diff)
tinita wrote in #note-1:
What kind of alert is this? I can't find anything in my inbox mentioning foobar in the subject...
Correct. The subject would've been [FIRING:3, RESOLVED:2] host_up (openQA worker) because of alert grouping.
I'm adding the (broken) links from the email.
Updated by livdywan about 1 year ago
- Subject changed from foobar host up alert to foobar host up alert size:S
- Description updated (diff)
- Status changed from New to Workable
Updated by jbaier_cz about 1 year ago
- Status changed from Workable to Resolved
I wasn't able to find any related data in InfluxDB or in salt. I assume it was just a test or a mistake and that all data has since been cleaned up. The alert has been unsilenced since yesterday and has not fired again. All seems fine.