action #112736
closed coordination #109846: [epic] Ensure all our database tables accomodate enough data, e.g. bigint for ids
Better alert based on 2022-06-18 incident size:M
Start date:
2022-06-20
Due date:
% Done:
0%
Estimated time:
Tags:
Description
Motivation
On 2022-06-18 we had issue #112718, in which OSD showed significantly degraded performance.
https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?orgId=1&editPanel=84&from=1655503200000&to=1655675999000&tab=alert and https://monitor.qa.suse.de/d/WebuiDb/webui-summary?orgId=1&from=1655516807123&to=1655574941005&viewPanel=89 showed a significant increase in the number of rows returned per PostgreSQL query and an increased Apache response time, but no specific alert was raised. We could benefit from having a dedicated alert for that.
Acceptance criteria
- AC1: A provisioned alert exists for database rows returned
Suggestions
- We have a monitoring panel for database rows returned; create a sensible alert for it
- Account for sporadic spikes which we should not alert on, e.g. with a grace period such as "every 1m for 10m"
- Add the alert to the provisioning data. If you have questions about how to properly provision an alert from salt, ask mkittler or nsinger (the README of the salt states repo has also been updated recently)
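To illustrate the grace period suggestion above, here is a minimal Python sketch of "every 1m for 10m" semantics: the condition is checked once per minute and the alert only fires after the threshold has been exceeded for 10 consecutive checks, so short spikes are ignored. This is a simplified model and not the monitoring stack's actual implementation. The names `PendingAlert` and `run` and the threshold value are hypothetical; a sensible threshold would have to be derived from the panel data.

```python
from dataclasses import dataclass


@dataclass
class PendingAlert:
    """Simplified model of an alert rule with a grace period.

    The threshold is a hypothetical limit for "rows returned";
    for_evals=10 models "for 10m" at one evaluation per minute.
    """
    threshold: float
    for_evals: int = 10
    _streak: int = 0  # consecutive evaluations above the threshold

    def evaluate(self, value: float) -> str:
        # Any sample at or below the threshold resets the grace period.
        if value <= self.threshold:
            self._streak = 0
            return "ok"
        self._streak += 1
        # Fire only once the breach has persisted long enough.
        return "alerting" if self._streak >= self.for_evals else "pending"


def run(samples, threshold, for_evals=10):
    """Feed one sample per evaluation interval and collect the states."""
    alert = PendingAlert(threshold=threshold, for_evals=for_evals)
    return [alert.evaluate(v) for v in samples]


if __name__ == "__main__":
    # Made-up samples: a 3-minute spike, a recovery, then a sustained breach.
    samples = [5] * 3 + [50] * 3 + [5] + [50] * 10
    print(run(samples, threshold=10))
```

In this model the 3-minute spike only reaches the "pending" state and is reset by the next normal sample, while the sustained breach turns into "alerting" on its tenth consecutive evaluation.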
Out of scope
- Replicating the database content for people to play with