Project

General

Profile

Actions

action #163610

closed

Conduct "lessons learned" with Five Why analysis for "[alert] (HTTP Response alert Salt tm0h5mf4k)"

Added by okurz about 2 months ago. Updated about 1 month ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Regressions/Crashes
Target version:
Start date:
2024-07-10
Due date:
% Done:

0%

Estimated time:

Description

Motivation

In #163592 OSD was slow, sluggish or causes a lot of incomplete jobs on 2024-07-10 for multiple hours (ongoing at time of writing on 2024-07-10). We should learn what happened and find improvements for the future.

Acceptance criteria

  • AC1: A Five-Whys analysis has been conducted and results documented
  • AC2: Improvements are planned

Suggestions

  • Bring up in retro
  • Conduct "Five-Whys" analysis for the topic
  • Identify follow-up tasks in tickets
  • Organize a call to conduct the 5 whys (not as part of the retro)

Related issues 2 (0 open2 closed)

Copied from openQA Infrastructure - action #163592: [alert] (HTTP Response alert Salt tm0h5mf4k) size:MResolvedokurz2024-07-10

Actions
Copied to openQA Infrastructure - action #163775: Conduct "lessons learned" with Five Why analysis about many alerts, e.g. alerts not silenced for known issues size:SResolvedlivdywan2024-07-10

Actions
Actions

Also available in: Atom PDF