coordination #137714
open
[qe-core] proposal for new tests that cover btrfsmaintenance
Description
The scope of the proposed new test is to check the functionality and the system load impact of the btrfsmaintenance
package. This package is basically a set of scheduled tasks performing 4 BTRFS maintenance operations:
- balance
- defrag
- scrub
- trim
I'd declare the trim operation out of scope for now, since it involves specific hardware (SSD disks), but we can think about scheduling tests on specific machines.
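The schedules and target paths for these operations come from the package configuration; on my test system this is /etc/sysconfig/btrfsmaintenance, so a first sanity check of the test could be to read the relevant variables from there (the test should parse the file rather than hard-code the values):
# grep -E '^BTRFS_(BALANCE|SCRUB|DEFRAG|TRIM)_' /etc/sysconfig/btrfsmaintenance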
Balance
This test needs at least 2 extra block devices, which can be separate (virtual) disks or partitions of the same disk.
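If the test machine cannot get extra virtual disks, loop devices backed by sparse files could be used as a stand-in; a minimal sketch (paths and sizes are arbitrary):
#! /bin/bash
# create two sparse 5 GiB backing files and attach them as loop devices,
# to be used in place of /dev/vdb and /dev/vdc in the commands below
truncate -s 5G /var/tmp/disk1.img /var/tmp/disk2.img
DEV1=$(losetup --find --show /var/tmp/disk1.img)
DEV2=$(losetup --find --show /var/tmp/disk2.img)
echo "using $DEV1 and $DEV2"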
Create a raid0 filesystem on a single 5 GiB disk:
# mkfs.btrfs -d raid0 -m raid0 /dev/vdb
# mount /dev/vdb /mnt/raid
# df -h | grep raid
/dev/vdb 5.0G 3.4M 4.8G 1% /mnt/raid
check statistics:
# btrfs filesystem df /mnt/raid
Data, RAID0: total=512.00MiB, used=64.00KiB
Data, single: total=512.00MiB, used=256.00KiB
System, RAID0: total=8.00MiB, used=0.00B
System, single: total=32.00MiB, used=16.00KiB
Metadata, RAID0: total=256.00MiB, used=112.00KiB
Metadata, single: total=256.00MiB, used=0.00B
GlobalReserve, single: total=3.25MiB, used=0.00B
WARNING: Multiple block group profiles detected, see 'man btrfs(5)'.
WARNING: Data: single, raid0
WARNING: Metadata: single, raid0
WARNING: System: single, raid0
create a big binary file
# time dd if=/dev/random of=/mnt/raid/bigfile.bin bs=4M count=1024
1024+0 records in
1024+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 46.4441 s, 92.5 MB/s
real 0m46.445s
user 0m0.000s
sys 0m45.580s
check statistics to see disk usage
# btrfs device usage /mnt/raid/
/dev/vdb, ID: 1
Device size: 5.00GiB
Device slack: 0.00B
Data,single: 3.94GiB
Data,RAID0/1: 512.00MiB
Metadata,single: 256.00MiB
Metadata,RAID0/1: 256.00MiB
System,single: 32.00MiB
System,RAID0/1: 8.00MiB
Unallocated: 21.00MiB
notice the difference between free space reported by 'df' and unallocated space reported by btrfs:
# df -h | grep -E '(raid|Filesystem)'
Filesystem Size Used Avail Use% Mounted on
/dev/vdb 5.0G 4.1G 470M 90% /mnt/raid
# btrfs filesystem df /mnt/raid/
Data, RAID0: total=512.00MiB, used=448.50MiB
Data, single: total=3.94GiB, used=3.56GiB
System, RAID0: total=8.00MiB, used=0.00B
System, single: total=32.00MiB, used=16.00KiB
Metadata, RAID0: total=256.00MiB, used=4.33MiB
Metadata, single: total=256.00MiB, used=0.00B
GlobalReserve, single: total=3.80MiB, used=0.00B
WARNING: Multiple block group profiles detected, see 'man btrfs(5)'.
WARNING: Data: single, raid0
WARNING: Metadata: single, raid0
WARNING: System: single, raid0
now add a new disk device to the raid0:
# btrfs device add -f /dev/vdc /mnt/raid/
# btrfs device usage /mnt/raid/
/dev/vdb, ID: 1
Device size: 5.00GiB
Device slack: 0.00B
Data,single: 3.94GiB
Data,RAID0/1: 512.00MiB
Metadata,single: 256.00MiB
Metadata,RAID0/1: 256.00MiB
System,single: 32.00MiB
System,RAID0/1: 8.00MiB
Unallocated: 21.00MiB
/dev/vdc, ID: 2
Device size: 5.00GiB
Device slack: 0.00B
Unallocated: 5.00GiB
The aggregate filesystem is now larger, but the available space does not reflect the new size, because the filesystem is *unbalanced*; we need to trigger a balance of the filesystem. The next operation is I/O intensive:
# df -h | grep -E '(raid|Filesystem)'
Filesystem Size Used Avail Use% Mounted on
/dev/vdb 10G 4.1G 490M 90% /mnt/raid
# btrfs balance start --full-balance -v /mnt/raid
now free space is available:
# df -h | grep raid
Filesystem Size Used Avail Use% Mounted on
/dev/vdb 10G 3.6G 6.3G 37% /mnt/raid
Here we can either force the balance, or just "start" it and let it run in the background using the script shipped in the maintenance package, /usr/share/btrfsmaintenance/btrfs-balance.sh (a minimal check is sketched below).
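A sanity check around the packaged script could look like this; note that how the script selects the mountpoints to balance (e.g. via /etc/sysconfig/btrfsmaintenance) still needs to be verified against the shipped version:
#! /bin/bash
# record the available space, run the packaged balance script, then verify
# that the reported available space has grown
before=$(df -P /mnt/raid | awk 'NR==2 {print $4}')
/usr/share/btrfsmaintenance/btrfs-balance.sh
after=$(df -P /mnt/raid | awk 'NR==2 {print $4}')
echo "available KiB before: $before, after: $after"
[ "$after" -gt "$before" ] && echo "balance freed space"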
Scrub
Scrub is a pass over all filesystem data and metadata, verifying the checksums. If a valid copy is available (replicated block group profiles), the damaged one is repaired. All copies of the replicated profiles are validated.
To test btrfs-scrub, the idea is to deliberately create a defective filesystem.
This can be accomplished in many ways, for example:
- manually corrupting data blocks
- deleting important metadata (using dd or similar)
- unmounting the filesystem abruptly
- leveraging the kernel's fault injection features: https://www.kernel.org/doc/Documentation/fault-injection/fault-injection.txt
let's start by creating a RAID1 filesystem:
# mkfs.btrfs -f -d raid1 -m raid1 /dev/vdb /dev/vdc
# mount /dev/vdb /mnt/raid
# df -h | grep -E '(raid|Filesystem)'
Filesystem Size Used Avail Use% Mounted on
/dev/vdb 5.0G 3.7M 4.8G 1% /mnt/raid
# echo "this is a test" > /mnt/raid/myfile.txt
# umount /mnt/raid
Now we need to find out at which offset in the block device the string we just wrote is located:
# grep --only-matching --byte-offset --max-count=1 --text "this is a test" /dev/vdc
9616961:this is a test
Now we can overwrite the data on only one of the two disks, by writing directly to the block device:
# echo "THIS IS A MESS" | dd of=/dev/vdc bs=1 conv=notrunc seek=9616961
15+0 records in
15+0 records out
15 bytes copied, 0.00108664 s, 13.8 kB/s
remount RAID device:
# mount /dev/vdc /mnt/raid
Create many small files of random size, just to increase data and metadata usage:
#! /bin/bash
# run from inside /mnt/raid: create 50 directories with 1000 small files
# each; file sizes are random (up to ~10 KiB) so data and metadata usage grows
for d in {1..50}; do
    mkdir "dir$(printf %03d "$d")" && pushd "$_"
    for n in {0..999}; do
        dd if=/dev/zero of="file$(printf %03d "$n").bin" bs=10 count=$(( RANDOM % 1024 )) >& /dev/null
    done
    popd
done
start scrub:
# btrfs scrub start -B -f /mnt/raid/
The scrubbing status is recorded in /var/lib/btrfs/ in textual files named scrub.status.UUID for a filesystem identified by the given UUID. (Progress state is communicated through a named pipe in file scrub.progress.UUID in the same directory.) The status file is updated every 5 seconds. A resumed scrub will continue from the last saved position.
In the dmesg output we can find some details:
[ 9537.259295] BTRFS info (device vdb): scrub: started on devid 2
[ 9537.259631] BTRFS info (device vdb): scrub: started on devid 1
[ 9537.260543] BTRFS info (device vdb): scrub: finished on devid 1 with status: 0
[ 9537.260563] BTRFS warning (device vdb): checksum error at logical 30572544 on dev /dev/vdc, physical 9601024: metadata leaf (level 0) in tree 5
[ 9537.260565] BTRFS warning (device vdb): checksum error at logical 30572544 on dev /dev/vdc, physical 9601024: metadata leaf (level 0) in tree 5
[ 9537.260566] BTRFS error (device vdb): bdev /dev/vdc errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[ 9537.260954] BTRFS error (device vdb): fixed up error at logical 30572544 on dev /dev/vdc
[ 9537.261073] BTRFS info (device vdb): scrub: finished on devid 2 with status: 0
# ls -l /var/lib/btrfs/scrub.status.*
-rw------- 1 root root 723 Oct 7 11:28 /var/lib/btrfs/scrub.status.71589e19-f992-464b-9113-5d5b8a19480d
-rw------- 1 root root 763 Oct 9 15:10 /var/lib/btrfs/scrub.status.ded4e395-487a-4ad1-aee3-a077ea110fb6
# cat /var/lib/btrfs/scrub.status.ded4e395-487a-4ad1-aee3-a077ea110fb6
scrub status:1
ded4e395-487a-4ad1-aee3-a077ea110fb6:1|data_extents_scrubbed:5|tree_extents_scrubbed:8|data_bytes_scrubbed:327680|tree_bytes_scrubbed:131072|read_errors:0|csum_errors:0|verify_errors:0|no_csum:80|csum_discards:0|super_errors:0|malloc_errors:0|uncorrectable_errors:0|corrected_errors:0|last_physical:1372585984|t_start:1696857002|t_resumed:0|duration:0|canceled:0|finished:1
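For the test assertion, the counters in that status file can be parsed directly; a rough sketch, assuming the one-counter-line-per-device layout shown above (the UUID lookup via 'btrfs filesystem show' is just one way to obtain it):
#! /bin/bash
# sum the error counters across all device lines of the scrub status file
uuid=$(btrfs filesystem show /mnt/raid | awk '/uuid:/ {print $NF}')
status="/var/lib/btrfs/scrub.status.$uuid"
corrected=$(grep -o 'corrected_errors:[0-9]*' "$status" | awk -F: '{s+=$2} END {print s+0}')
uncorrectable=$(grep -o 'uncorrectable_errors:[0-9]*' "$status" | awk -F: '{s+=$2} END {print s+0}')
echo "corrected=$corrected uncorrectable=$uncorrectable"
# the corruption injected above should have been detected and repaired
[ "$corrected" -ge 1 ] && [ "$uncorrectable" -eq 0 ] && echo "scrub repaired the injected error"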
Defrag
To test btrfs-defrag we need to create a fragmented filesystem.
This is hard because Btrfs is designed to handle data fragmentation automatically, aiming for optimal performance and data organization.
What we can do is:
- copy and delete random data
- append data to existing files
- create snapshots
This is a long test to run and requires a lot of crunching time. One option is to prepare a "fragmented" QCOW2 disk offline and give it to the test; let's discuss whether that is feasible. A rough sketch of an online approach is below.
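As a starting point for the discussion, a sketch of an online approach: grow many files a little at a time in interleaved order (appends tend to scatter the extents), then compare the extent count of a sample file before and after defragmenting. How much fragmentation this really produces on a given kernel needs to be measured, and filefrag (from e2fsprogs) is assumed to be available:
#! /bin/bash
# run from inside the filesystem under test
cd /mnt/raid || exit 1
# interleaved appends: each file grows by 64 KiB per round
for round in {1..100}; do
    for f in {1..200}; do
        dd if=/dev/urandom of="frag$f.bin" bs=64K count=1 oflag=append conv=notrunc status=none
    done
done
filefrag frag1.bin                              # extent count before defrag
btrfs filesystem defragment -r -v /mnt/raid
filefrag frag1.bin                              # extent count after defrag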
Updated by szarate about 1 year ago
- Tags set to qe-core-october-sprint
- Tracker changed from action to coordination
Task needs refinement
Updated by szarate about 1 year ago
- Sprint set to QE-Core: October Sprint 23 (Oct 11 - Nov 08)
- Category set to New test
- Status changed from New to Blocked
Blocked by subtasks
Updated by okurz about 1 year ago
Please be aware that there is already a test module related to btrfsmaintenance, "force_scheduled_tasks": https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/cf4a1e5f8e074fba873d7e9ee73128a48e9063ec/tests/console/force_scheduled_tasks.pm#L4
Updated by amanzini about 1 year ago
- Related to action #137246: [qe-core] test fails in force_scheduled_tasks added