action #163775
Conduct "lessons learned" with Five Why analysis about many alerts, e.g. alerts not silenced for known issues size:S
Status:
Resolved
Priority:
High
Assignee:
Category:
Organisational
Target version:
Start date:
2024-07-10
Due date:
% Done:
0%
Estimated time:
Tags:
Description
Motivation
Over the past days, if not weeks, we have seen many alerts. Many of them were not handled according to our process, and many were not silenced even though we already knew about the underlying issue. This causes alert fatigue and leaves handling solely to the person on alert duty.
Acceptance criteria
- AC1: A Five-Whys analysis has been conducted and results documented
- AC2: Improvements are planned
Suggestions
- Bring up in retro
- Conduct "Five-Whys" analysis for the topic
- Identify follow-up tasks in tickets
- Organize a call to conduct the 5 whys (not as part of the retro)
Updated by okurz 5 months ago
- Copied from action #163610: Conduct "lessons learned" with Five Why analysis for "[alert] (HTTP Response alert Salt tm0h5mf4k)" added
Updated by okurz 5 months ago
- Subject changed from Conduct "lessons learned" with Five Why analysis about many alerts, e.g. alerts not silenced for known issues to Conduct "lessons learned" with Five Why analysis about many alerts, e.g. alerts not silenced for known issues size:S
- Status changed from New to Workable
Updated by livdywan 5 months ago
- Related to action #163928: [alert] Openqa HTTP Response lost on 15-07-24 size:S added
Updated by livdywan 5 months ago
- Status changed from Feedback to Resolved
Five whys
- Why don't we silence HTTP response alerts?
  - The HTTP response alert is very broad
  - Everyone knew about this anyway
  - The alert would tell us if issues have been addressed
  - A notification policy override can restrict the recipient of an alert
  - Suggestion: A notification policy is derived from the alert silence
- Why don't we silence pipelines?
  - People are not aware of the script
  - Conveniently VPN and GitLab are not usable
  - It is rather difficult to do
We only answered 2 questions. Nevertheless, improvements to our documented process were made: https://progress.opensuse.org/projects/qa/wiki/Tools#Alert-handling