action #120007
closed: [alert] Many systemd alerts triggered on 06.11.22 size:S
Description
They were ok again on the same day, but we should investigate what happened. The problematic services were web UI host services and other services on that host, e.g. the postgresql.service alert. So maybe there was a problem on OSD itself.
Updated by mkittler about 2 years ago
- Status changed from New to In Progress
I've been looking at https://stats.openqa-monitor.qa.suse.de/d/webuiSyS/webui-systemd-services?editPanel=13&tab=alert&orgId=1&from=now-7d&to=now and strangely the state history doesn't show any "Alerting" entries. I suppose the mail with the "[No Data]" subject corresponds to the "NO DATA" entry (yellow question mark) from Nov. 6, 2022 03:37:03. There are more of those "NO DATA" entries, and there were mails about them as well (but we likely didn't look into them at the time; at least I haven't found a reply to those I've checked).
I suppose it can be normal that there is briefly no data. We normally work around it by setting "If no data or all values are null" to "Keep last state", but for these alerts it is set to "No data". So I suggest consistently setting this to "Keep last state".
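A rough sketch of how one could check which alerts are not set to "Keep last state". It assumes the dashboards are available as exported JSON using Grafana's legacy per-panel alert format, where the "If no data or all values are null" setting is stored as `noDataState` (with `keep_state` meaning "Keep last state"). The function names, panel ids and alert names below are made up for illustration and are not taken from our actual dashboards.

```python
def iter_panels(panels):
    """Yield every panel, including panels nested inside rows."""
    for panel in panels:
        yield panel
        yield from iter_panels(panel.get("panels", []))


def alerts_not_keeping_state(dashboard):
    """Return (panel id, alert name, noDataState) for each alert that is
    not configured as "Keep last state" (assumed value: "keep_state")."""
    return [
        (panel.get("id"), panel["alert"].get("name"), panel["alert"].get("noDataState"))
        for panel in iter_panels(dashboard.get("panels", []))
        if "alert" in panel and panel["alert"].get("noDataState") != "keep_state"
    ]


# Hypothetical dashboard excerpt, only for demonstration
sample = {
    "panels": [
        {"id": 1, "alert": {"name": "hypothetical service alert", "noDataState": "no_data"}},
        {"id": 2, "alert": {"name": "already fine", "noDataState": "keep_state"}},
        {"id": 3, "panels": [
            {"id": 4, "alert": {"name": "nested alert", "noDataState": "alerting"}},
        ]},
    ]
}

for finding in alerts_not_keeping_state(sample):
    print(finding)
```

This only reports the alerts and changes nothing, so any alert that is deliberately configured differently can be left as it is.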
Updated by mkittler about 2 years ago
I don't think there was anything wrong with those services, as it was just a "no data" alert.
This SR should prevent those notifications in the future: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/768
Updated by mkittler about 2 years ago
- Status changed from In Progress to Feedback
Updated by livdywan about 2 years ago
- Subject changed from [alert] Many systemd alerts triggered on 06.11.22 to [alert] Many systemd alerts triggered on 06.11.22 size:S
Updated by mkittler about 2 years ago
- Status changed from Feedback to Resolved
The SR has been merged and changes are effective in Grafana.
Updated by okurz about 2 years ago
- Due date set to 2022-11-18
- Status changed from Resolved to Feedback
Sorry, I still think https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/768/diffs#2bbeaa6f546d17e656be75c46f57589662264bea_982_981 is wrong. Please see #71098 for why we introduced those explicit "no data" alerts.
Please take a close look at my MR https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/771 to fix that again.
Updated by mkittler about 2 years ago
- Status changed from Feedback to Resolved
I thought this was only about https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/768/diffs?diff_id=139700#e3826da6725e3d4c3e0febb6a40114acc83256ec_966_965. I've merged your MR.