action #162719

closed

coordination #162716: [epic] Better use of storage on OSD workers

Ensure w40 has more space for worker pool directories size:S

Added by okurz 26 days ago. Updated about 5 hours ago.

Status: Resolved
Priority: High
Assignee:
Category: Feature requests
Target version:
Start date: 2024-06-21
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Motivation

w40 ran out of space in /var/lib/openqa despite having another partition with multiple TB of free space. We should reconsider the choices we made when setting up the OSD PRG2 workers.

# lsblk 
NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
nvme0n1     259:1    0   5.8T  0 disk  
├─nvme0n1p1 259:2    0   512M  0 part  /boot/efi
├─nvme0n1p2 259:3    0   5.8T  0 part  /var
…
│                                      /
└─nvme0n1p3 259:4    0     1G  0 part  [SWAP]
nvme2n1     259:5    0 476.9G  0 disk  
└─md127       9:127  0 476.8G  0 raid0 /var/lib/openqa
# hdparm -tT /dev/nvme?n1

/dev/nvme0n1:
 Timing cached reads:   30178 MB in  1.99 seconds = 15202.23 MB/sec
 Timing buffered disk reads: 6360 MB in  3.00 seconds = 2120.00 MB/sec

/dev/nvme2n1:
 Timing cached reads:   33204 MB in  1.98 seconds = 16739.11 MB/sec
 Timing buffered disk reads: 8478 MB in  3.00 seconds = 2825.74 MB/sec

nvme2n1 seems to be about 33% faster in buffered disk reads (2825.74 vs. 2120.00 MB/sec) but is far more limited in space (476.9G vs. 5.8T).
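
As a rough cross-check of the speed comparison, the relative difference can be computed from the hdparm figures quoted above. This is only a sketch: `speedup_pct` is a hypothetical helper introduced here for illustration, not part of hdparm or openQA, and a single hdparm run is a coarse measurement.

```shell
# Hypothetical helper: relative speedup of a new throughput over a baseline, in percent.
# Arguments: $1 = baseline MB/sec, $2 = new MB/sec.
speedup_pct() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new / base - 1) * 100 }'
}

# Figures copied from the hdparm output in this ticket (nvme0n1 as baseline).
speedup_pct 2120.00 2825.74    # buffered disk reads -> 33.3
speedup_pct 15202.23 16739.11  # cached reads        -> 10.1
```

The buffered disk reads are the more relevant number for the device itself; cached reads mostly reflect memory and cache throughput.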

Acceptance criteria

  • AC1: w40 has significantly more space than 500G for pool+cache combined
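
One possible way to verify AC1 on the worker is to compare the size of the filesystem backing the pool and cache directories against the 500G threshold. The sketch below assumes GNU coreutils `df` (for `--output` and `-BG`) and assumes that pool and cache both live on the filesystem mounted at /var/lib/openqa, as in the lsblk output above; `has_more_space_than` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the filesystem holding PATH is larger than MIN GiB.
# Arguments: $1 = path on the filesystem to check, $2 = minimum size in GiB.
# Assumes GNU coreutils df (--output and -B are GNU options).
has_more_space_than() {
    path="$1"
    min="$2"
    # The last line of df output is the size in GiB, e.g. "  477G"; keep only the digits.
    size=$(df -BG --output=size "$path" 2>/dev/null | tail -n 1 | tr -dc '0-9')
    [ -n "$size" ] && [ "$size" -gt "$min" ]
}
```

On w40 this could be called as `has_more_space_than /var/lib/openqa 500`; the exit status is 0 only if the filesystem is larger than 500G. What counts as "significantly more" is left to the reader.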

Suggestions

Out of scope

Rollback steps


Related issues 2 (2 open, 0 closed)

Related to openQA Infrastructure - action #162602: [FIRING:1] worker40 (worker40: CPU load alert openQA worker40 salt cpu_load_alert_worker40 worker) size:S (Blocked, okurz, 2024-06-20)

Copied to openQA Infrastructure - action #162725: After w40 reconsider storage use for other OSD workers (New, 2024-06-21)
