action #76876
Updated by livdywan over 3 years ago
## Observation

As max has repeatedly reported that he can't extract information from our automated Grafana alerts, I think it is time to find a better solution. Just setting infra as the receiver for Grafana alerts results in mails like this:

```
"Dear Colleague,

Thank you for your report of: "[No Data] [openqa] openqaworker-arm-3 online (long-time) alert"
assigned reference number: "178873"

Someone from the designate team will contact you about your request as soon as we can.
If you have additional comments or questions, you can follow up to the ticket here at : https://infra.nue.suse.com/Ticket/Display.html?id=178873

Regards,
The Engineering Infrastructure Team"
infra@suse.de

-------------------------------------------------------------------------
The original message:
-------------------------------------------------------------------------
[IMAGE] [IMAGE] [IMAGE] [IMAGE] [No Data] [openqa] openqaworker-arm-3 online (long-time) alert [No Data] [openqa] openqaworker-arm-3 online (long-time) alert [No Data] [openqa] openqaworker-arm-3 online (long-time) alert The IPMI management interface for this machine is inaccessible (again). The The IPMI management interface for this machine is inaccessible (again). The Metric name Metric name Value View your Alert rule (http://stats.openqa-monitor.qa.suse.de/d/1bNU0StZz/automatic-actions?fullscreen&edit&tab=alert&panelId=7&orgId=1) View your Alert rule (http://stats.openqa-monitor.qa.suse.de/d/1bNU0StZz/automatic-actions?fullscreen&edit&tab=alert&panelId=7&orgId=1) View your Alert rule (http://stats.openqa-m onitor.qa.suse.de/d/1bNU0StZz/automatic-actions?fullscreen&edit&tab=alert&panelId=7&orgId=1) Go to the Alerts page (http://stats.openqa-monitor.qa.suse.de/alerting) Go to the Alerts page (http://stats.openqa-monitor.qa.suse.de/alerting) Sent by Grafana v6.4.3 (http://stats.openqa-monitor.qa.suse.de/) Sent by Grafana v6.4.3 (http://stats.openqa-monitor.qa.suse.de/) machine itself is also not reachable over ping.
Suggested action: Reset the machine itself is also not reachable over ping. Suggested action: Reset the © 2016 Grafana and raintank © 2016 Grafana and raintank The IPMI management interface for this machine is inaccessible (again). The machine including the management interface. Similar issues were handled in machine including the management interface. Similar issues were handled in Value Go to the Alerts page (http://stats.openqa-monitor.qa.suse.de/alerting) [No Data] [openqa] openqaworker-arm-3 online (long-time) alert machine itself is also not reachable over ping. Suggested action: Reset the https://infra.nue.suse.com/SelfService/Update.html?id=174650 and https://infra.nue.suse.com/SelfService/Update.html?id=174650 and machine including the management interface. Similar issues were handled in https://infra.nue.suse.com/SelfService/Display.html?id=166330 and https://infra.nue.suse.com/SelfService/Display.html?id=166330 and The IPMI management interface for this machine is inaccessible (again). The https://infra.nue.suse.com/SelfService/Update.html?id=174650 and https://infra.nue.suse.com/SelfService/Display.html?id=164419 and https://infra.nue.suse.com/SelfService/Display.html?id=164419 and machine itself is also not reachable over ping. Suggested action: Reset the https://infra.nue.suse.com/SelfService/Display.html?id=166330 and https://infra.nue.suse.com/SelfService/Display.html?id=153124 for the same https://infra.nue.suse.com/SelfService/Display.html?id=153124 for the same machine including the management interface.
Similar issues were handled in https://infra.nue.suse.com/SelfService/Display.html?id=164419 and machine machine https://infra.nue.suse.com/SelfService/Update.html?id=174650 and https://infra.nue.suse.com/SelfService/Display.html?id=153124 for the same https://infra.nue.suse.com/SelfService/Display.html?id=166330 and machine https://infra.nue.suse.com/SelfService/Display.html?id=164419 and https://infra.nue.suse.com/SelfService/Display.html?id=153124 for the same Metric name machine Value Metric name View your Alert rule (http://stats.openqa-monitor.qa.suse.de/d/1bNU0StZz/automatic-actions?fullscreen&edit&tab=alert&panelId=7&orgId=1) Value Go to the Alerts page (http://stats.openqa-monitor.qa.suse.de/alerting) View your Alert rule (http://stats.openqa-monitor.qa.suse.de/d/1bNU0StZz/automatic-actions?fullscreen&edit&tab=alert&panelId=7&orgId=1) Go to the Alerts page (http://stats.openqa-monitor.qa.suse.de/alerting) Sent by Grafana v6.4.3 (http://stats.openqa-monitor.qa.suse.de/) © 2016 Grafana and raintank
```

## Suggestions

Ideas for how to fix this:

* Maybe the mail template can be changed? (ideally to text only)
* We can use an approach similar to the one we already have for automated_actions: let a custom GitLab job create the infra ticket
* We can implement our own piece of software that talks to the Grafana webhook API
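
To make the webhook idea more concrete, here is a minimal sketch of how such a piece of software could turn an alert into a text-only mail. It assumes the JSON fields of Grafana's legacy webhook notification (`title`, `message`, `state`, `evalMatches`, `ruleUrl`); these should be checked against the payload our Grafana version actually sends. The function names and the sender address are hypothetical, and the HTTP endpoint that would receive the webhook is left out.

```python
from email.message import EmailMessage


def format_alert_as_text(payload: dict) -> str:
    """Render a Grafana webhook payload as a short plain-text report.

    The field names are assumed from Grafana's legacy webhook format.
    """
    lines = [
        payload.get("title", "(no title)"),
        "",
        payload.get("message", "(no message)"),
        "",
        f"State: {payload.get('state', 'unknown')}",
    ]
    # evalMatches lists the metrics that triggered the alert; it can be
    # empty or missing, e.g. for "No Data" alerts.
    for match in payload.get("evalMatches") or []:
        lines.append(f"Metric: {match.get('metric')} = {match.get('value')}")
    if payload.get("ruleUrl"):
        lines.append(f"Alert rule: {payload['ruleUrl']}")
    return "\n".join(lines) + "\n"


def build_ticket_mail(payload: dict, sender: str, recipient: str) -> EmailMessage:
    """Build a text/plain mail (no HTML part) for the ticket system."""
    mail = EmailMessage()
    mail["From"] = sender
    mail["To"] = recipient
    mail["Subject"] = payload.get("title", "Grafana alert")
    mail.set_content(format_alert_as_text(payload))
    return mail


if __name__ == "__main__":
    example = {
        "title": "[No Data] [openqa] openqaworker-arm-3 online (long-time) alert",
        "state": "no_data",
        "message": "The IPMI management interface for this machine is inaccessible (again).",
        "ruleUrl": "http://stats.openqa-monitor.qa.suse.de/alerting",
        "evalMatches": [],
    }
    # The sender address is a placeholder; nothing is sent here.
    print(build_ticket_mail(example, "alerts@example.org", "infra@suse.de"))
```

Because the mail has only a single text/plain part, the ticket system has no HTML layout to flatten, so the message text should appear only once in the resulting ticket.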