action #181184
Conduct lessons learned "Five Why" analysis for "OSD is down since 2025-04-19 due to accidental user actions removing parts of the root filesystem" size:S
Status: Workable
Priority: High
Assignee: -
Category: Organisational
Target version:
Start date: 2025-04-20
Due date:
% Done: 0%
Estimated time:
Tags:
Description
Motivation
See #181175. I assume the following cron mail, archived at https://mailman.suse.de/mlarch/SuSE/osd-admins/2025/osd-admins.2025.04/msg00343.html, was what originally triggered the manual action:
Subject: Cron <postgres@openqa> backup_dir="/var/lib/openqa/backup"; date=$(date -Idate); bf="$backup_dir/$date.dump"; test -e "$bf" || ionice -c3 nice -n19 pg_dump -Fc openqa -f "$bf"; find $backup_dir/ -mtime +7 -print0 | xargs -0 rm -v
From: "(Cron Daemon)" <postgres@openqa.oqa.prg2.suse.org>
Date: Fri, 18 Apr 2025 23:40:01 +0000 (UTC)
Background
Questions
- Why do we have such a long command as a crontab entry and not in a script?
  - A1-1: ...
    - => I1-1-1: ...
- ...
  - A2-1: ...
    - => I2-1-1: ...
- Why ...
  - A1-1: ...
    - => I1-1-1: ...
- Why ...
  - A1-1: ...
    - => I1-1-1: ...
- Why ...
  - A1-1: ...
    - => I1-1-1: ...
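As input for the first question, here is a minimal sketch of how the crontab one-liner quoted under Motivation could look as a script. It is not the deployed setup: the function names are hypothetical, and restricting the cleanup to `*.dump` files directly inside the backup directory is an assumption that goes beyond what the original command does. It uses `find -delete` instead of piping to `xargs rm`, so `rm` is never invoked with an empty argument list.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not taken from the OSD setup.

# Print the dump file path for a backup directory ($1) and an ISO date ($2).
backup_file_for() {
    printf '%s/%s.dump\n' "$1" "$2"
}

# Write a custom-format dump of the "openqa" database to the file $1,
# with the same low I/O and CPU priority as the crontab entry.
dump_database() {
    ionice -c3 nice -n19 pg_dump -Fc openqa -f "$1"
}

# Create today's dump in directory $1 unless it exists, then prune old dumps.
run_backup() {
    backup_dir=$1
    # Refuse to continue if the directory is unset or missing, so the
    # cleanup below never runs against an unintended path.
    if [ -z "$backup_dir" ] || [ ! -d "$backup_dir" ]; then
        echo "backup directory '$backup_dir' does not exist" >&2
        return 1
    fi
    bf=$(backup_file_for "$backup_dir" "$(date -Idate)")
    if [ ! -e "$bf" ]; then
        dump_database "$bf" || return 1
    fi
    # Assumption: only *.dump files directly inside the directory are pruned.
    # -maxdepth and -delete are GNU find extensions.
    find "$backup_dir" -maxdepth 1 -type f -name '*.dump' -mtime +7 -print -delete
}

# Example invocation (commented out so that sourcing this file has no effect):
# run_backup /var/lib/openqa/backup
```

With something like this in place, the crontab entry for the postgres user would shrink to a single call of the script, and the script itself could be reviewed and tested like any other code.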
Acceptance criteria
- AC1: A Five-Whys analysis has been conducted and results documented
- AC2: Improvements are planned
Suggestions
- Conduct "Five-Whys" analysis for the topic
- Identify follow-up tasks in tickets
- Organize a call to conduct the 5 whys
Updated by okurz 3 days ago
- Copied from action #180863: Conduct lessons learned "Five Why" analysis for "Gracious handling of longer remote git clones outages" size:S added
Updated by okurz 3 days ago
- Related to action #181175: OSD is down since 2025-04-19 due to accidental user actions removing parts of the root filesystem size:M added
Updated by okurz 3 days ago
- Copied from deleted (action #180863: Conduct lessons learned "Five Why" analysis for "Gracious handling of longer remote git clones outages" size:S)
Updated by tinita about 14 hours ago
In order to not forget I want to note one question beforehand:
- Why do we have such a long command as a crontab entry and not in a script?
Updated by livdywan about 12 hours ago
- Subject changed from Conduct lessons learned "Five Why" analysis for "Lessons learned for "OSD is down since 2025-04-19 due to accidental user actions removing parts of the root filesystem" to Conduct lessons learned "Five Why" analysis for "Lessons learned for "OSD is down since 2025-04-19 due to accidental user actions removing parts of the root filesystem" size:S
- Status changed from New to Workable
Updated by livdywan about 12 hours ago
- Description updated (diff)
tinita wrote in #note-6:
In order to not forget I want to note one question beforehand:
- Why do we have such a long command as a crontab entry and not in a script?
I put it in the template, so we don't overlook it when discussing it
Updated by livdywan about 12 hours ago
- Priority changed from Normal to High
Also, this should be High so we do it soon while our memory is fresh