action #164481

open

coordination #164466: [saga][epic] Scale up: Hyper-responsive openQA webUI

openQA Infrastructure - coordination #164469: [epic] Better tools team incident handling

[tools] Discuss benefits vs. drawbacks about applying mitigations as early as possible vs. keeping system in broken state to ease investigation. Also do industry standards best practice research size:S

Added by okurz about 1 month ago. Updated about 1 month ago.

Status: Workable
Priority: Normal
Assignee: -
Target version:
Start date: 2024-07-26
Due date:
% Done: 0%
Estimated time:
Description

Motivation

See #163610-4 (HTTP Response alert Salt)

Acceptance criteria

  • AC1: The tools team knows which approach or methodology to follow in general, and the wiki explains it

Suggestions


Related issues 2 (2 open, 0 closed)

Copied from openQA Infrastructure - action #164478: Monitoring of idle/busy webUI/liveview handler workers (New, 2024-07-26)

Copied to openQA Infrastructure - action #164484: [tools] Investigation helper, e.g. commands in a bash script to collect useful logs, systemd journal, etc. (New, 2024-07-26)

Actions #1

Updated by okurz about 1 month ago

  • Copied from action #164478: Monitoring of idle/busy webUI/liveview handler workers added
Actions #2

Updated by okurz about 1 month ago

  • Copied to action #164484: [tools] Investigation helper, e.g. commands in a bash script to collect useful logs, systemd journal, etc. added
Actions #3

Updated by okurz about 1 month ago

  • Target version changed from future to Ready
Actions #4

Updated by okurz about 1 month ago

  • Subject changed from [tools] Discuss benefits vs. drawbacks about applying mitigations as early as possible vs. keeping system in broken state to ease investigation. Also do industry standards best practice research to [tools] Discuss benefits vs. drawbacks about applying mitigations as early as possible vs. keeping system in broken state to ease investigation. Also do industry standards best practice research size:S
  • Description updated (diff)
  • Status changed from New to Workable
Actions #5

Updated by kraih about 1 month ago

I asked Mixtral about this:

Yes, there are industry standards and best practices that organizations can follow when deciding whether to apply mitigations as early as possible or keep a system in a broken state to ease investigation. Here are some examples:

1. ITIL (Information Technology Infrastructure Library): ITIL is a framework for managing IT services. It includes a process for incident management that recommends restoring normal service operation as quickly as possible while also minimizing the impact of the incident. This means that organizations should apply mitigations as soon as possible, while also conducting a thorough investigation to understand the root cause of the issue.

2. NIST (National Institute of Standards and Technology): NIST provides guidelines for incident response and recovery. Its guidelines recommend that organizations apply mitigations as soon as possible to reduce the impact of an incident, while also conducting a thorough investigation to understand the cause and prevent future incidents.

3. ISO/IEC 27001: This is an international standard for information security management systems (ISMS). It recommends that organizations establish processes for incident management, including the application of mitigations as soon as possible to reduce the impact of an incident.

4. COBIT (Control Objectives for Information and Related Technologies): COBIT is a framework for IT management and IT governance. It includes a process for incident management that recommends restoring normal service operation as quickly as possible while also minimizing the impact of the incident.

By following these industry standards and best practices, organizations can make informed decisions about how to balance the need to apply mitigations as early as possible with the need to conduct a thorough investigation. It's important to note that each situation is unique, and organizations should consider the specific circumstances of each incident when making a decision.