action #163775
closed
Conduct "lessons learned" with Five Why analysis about many alerts, e.g. alerts not silenced for known issues size:S
Added by okurz 5 months ago.
Updated 5 months ago.
Description
Motivation
Over the past days, if not weeks, we have seen many alerts. Many of them were not handled according to our process, and many were not silenced even though we already knew about the underlying issue. This causes alert fatigue and leaves only the person on alert duty to handle them.
Acceptance criteria
- AC1: A Five-Whys analysis has been conducted and results documented
- AC2: Improvements are planned
Suggestions
- Bring up in retro
- Conduct "Five-Whys" analysis for the topic
- Identify follow-up tasks in tickets
- Organize a call to conduct the 5 whys (not as part of the retro)
- Copied from action #163610: Conduct "lessons learned" with Five Why analysis for "[alert] (HTTP Response alert Salt tm0h5mf4k)" added
- Subject changed from Conduct "lessons learned" with Five Why analysis about many alerts, e.g. alerts not silenced for known issues to Conduct "lessons learned" with Five Why analysis about many alerts, e.g. alerts not silenced for known issues size:S
- Status changed from New to Workable
- Status changed from Workable to Feedback
- Assignee set to livdywan
Let's discuss this on Wednesday 24 July afternoon. I feel like it would be good to consider #163610 in this context but I guess we can also do this one first.
livdywan wrote in #note-3:
Let's discuss this on Wednesday 24 July afternoon. I feel like it would be good to consider #163610 in this context but I guess we can also do this one first.
Happening today at 14:00 Berlin time.
- Related to action #163928: [alert] Openqa HTTP Response lost on 15-07-24 size:S added
- Status changed from Feedback to Resolved
Five whys
- Why don't we silence HTTP response alerts?
- The HTTP response alert is very broad
- Everyone knew about this anyway
- The alert would tell us if issues have been addressed
- A notification policy override can restrict the recipient of an alert
- Suggestion: A notification policy is derived from the alert silence
- Why don't we silence pipelines?
- People are not aware of the script
- VPN and GitLab are not conveniently usable
- It is rather difficult to do
We only answered two of the questions. Nevertheless, improvements to our documented process were made: https://progress.opensuse.org/projects/qa/wiki/Tools#Alert-handling