action #123825
closed
Ensure proper o3 monitoring after shutdown of thruk/icinga by SUSE-IT Eng-Infra
Description
Motivation
See email from Eng-Infra to devel@suse.de:
We are going to shut down the old monitoring system based on Icinga.
The Icinga monitoring system started in 2013 as a replacement for the old Nagios monitoring system.
Over the years it has been evolving and growing. We implemented several new technologies and
approaches there, for example check_mk, Thruk or SUSE HA.
But time has moved on and monitoring tools have evolved as well. The open source community came up with new
tools. It's time to say goodbye to Icinga, and thanks for all the mails.
The Icinga monitoring system running on the portal thruk.suse.de will be replaced by the new tool Zabbix
running on the machine zabbix.suse.de.
The new tool monitors the infrastructure, servers and services that are the responsibility of the SUSE IT infra team.
We as a team are open to offering this service to other SUSE teams.
The service will be shut down on 1.2.2023.
With this change we should ensure that monitoring and alerting, in particular for o3, still work. We likely need to make sure the corresponding configuration is added to the new instance.
Acceptance criteria
- AC1: Monitoring data regarding o3 is available for SUSE QE Tools team members
- AC2: SUSE QE Tools team members are alerted in case of critical problems regarding o3
- AC3: Our documentation mentions/links the new system
Suggestions
- Clarify the current situation, e.g. whether the o3 monitoring was already migrated
- Get access to the new system
- Test access for multiple team members
- Add missing configuration as needed
- Test alerting
- Mention/link the new system in our documentation, e.g. progress.opensuse.org/projects/openqav3/wiki/ and progress.opensuse.org/projects/qa/wiki/tools#Common-tasks-for-team-members
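As a first step towards "Get access to the new system", a reachability check against the Zabbix JSON-RPC API could look roughly like the sketch below. It assumes that zabbix.suse.de exposes the API at the default path `/api_jsonrpc.php`, which is not confirmed in this ticket; the helper names are hypothetical. It uses the unauthenticated `apiinfo.version` method, so it only shows that the instance answers. It does not verify per-user access or alerting.

```python
import json
import urllib.request

# Assumption: the API is served at the Zabbix default path on the new host.
ZABBIX_API_URL = "https://zabbix.suse.de/api_jsonrpc.php"


def build_version_request(request_id=1):
    """Build the JSON-RPC 2.0 body for apiinfo.version (called without auth)."""
    return {
        "jsonrpc": "2.0",
        "method": "apiinfo.version",
        "params": [],
        "id": request_id,
    }


def parse_version_response(body):
    """Return the version string, or raise if the API reported an error."""
    reply = json.loads(body)
    if "error" in reply:
        raise RuntimeError(f"Zabbix API error: {reply['error']}")
    return reply["result"]


def fetch_api_version(url=ZABBIX_API_URL, timeout=10):
    """POST the version request and return the version reported by the server."""
    payload = json.dumps(build_version_request()).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json-rpc"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_version_response(response.read().decode("utf-8"))
```

Calling `fetch_api_version()` from a machine inside the network should return the version string if the instance is reachable. Each team member could run it once before testing their own login and the alerting setup.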