action #162362

closed

2024-06-15 osd not accessible - ensure healthy filesystems size:S

Added by okurz 13 days ago. Updated 5 days ago.

Status: Resolved
Priority: High
Assignee:
Category: Feature requests
Target version:
Start date: 2024-06-17
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Motivation

#162332-7 / Check the filesystems for errors from the running OS, without rebooting. If errors are found, shut down the openQA services gracefully and with prior announcement, fix the problems from the running OS, and trigger reboots only after ensuring the filesystems are clean.

Acceptance criteria

  • AC1: All 5 storage devices on OSD report clean filesystem integrity

Suggestions

  • Run for i in b c d e; do xfs_repair -m 4096 -n /dev/vd$i; done on OSD to check. If any problems are found, try to keep services running but in read-only mode, as we did at some point in the past; at the very least stop openqa-scheduler and similar services. Then run xfs_repair without -n on the affected storage devices during off-hours, with pre-announcement, e.g. on Thursday during the maintenance window.
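
The read-only check suggested above could be wrapped in a small shell function that reports a result per device. This is only a minimal sketch, not the procedure actually used for this ticket: the function name check_devices and the XFS_REPAIR override variable are hypothetical, and it assumes that a non-zero exit status from xfs_repair -n means either that corruption was detected or that the check could not run, so both cases are reported as needing attention.

```shell
# Dry-run integrity check: -n makes xfs_repair report problems without
# modifying anything, -m 4096 caps its memory use at roughly 4096 MB.
# XFS_REPAIR is a hypothetical override so the function can be tried
# with a stub command instead of the real tool.
check_devices() {
    failed=""
    for dev in "$@"; do
        if "${XFS_REPAIR:-xfs_repair}" -m 4096 -n "$dev" >/dev/null 2>&1; then
            echo "clean: $dev"
        else
            # Non-zero exit: corruption found, or the check could not run.
            echo "needs attention: $dev"
            failed="$failed $dev"
        fi
    done
    # Succeed only if no device was flagged.
    [ -z "$failed" ]
}
```

On OSD this would be called as check_devices /dev/vdb /dev/vdc /dev/vdd /dev/vde, run as root. The return status can then gate any follow-up step, such as deciding whether a maintenance window needs to be announced.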

Rollback steps

  • DONE re-enable cron service for OSD in openqa-service in /etc/crontab

Related issues 2 (1 open, 1 closed)

Copied from openQA Infrastructure - action #162359: Change OSD root to more modern filesystem mount options (New, 2024-06-17)

Copied to openQA Infrastructure - action #162365: OSD can fail on xfs_repair OOM conditions size:S (Resolved, jbaier_cz, 2024-06-17)
