action #163790
Updated by mkittler 5 months ago
## Observation
I copied the corrupted config file to /etc/openqa/openqa.ini.corrupted-2024-07-11-okurz-poo163790
On backup-vm.qe.nue2.suse.org I see:
```
okurz@backup-vm:~> ls -la /home/rsnapshot/*/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 13056 Jul 11 19:32 /home/rsnapshot/alpha.0/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 13056 Jul 11 15:32 /home/rsnapshot/alpha.1/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 13056 Jul 11 12:32 /home/rsnapshot/alpha.2/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 2 martchus root 13056 Jul 11 07:32 /home/rsnapshot/alpha.3/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 2 martchus root 13056 Jul 11 07:32 /home/rsnapshot/alpha.4/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 13056 Jul 11 03:32 /home/rsnapshot/alpha.5/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 13056 Jul 10 03:32 /home/rsnapshot/beta.0/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 13056 Jul 9 03:32 /home/rsnapshot/beta.1/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 13056 Jul 8 03:32 /home/rsnapshot/beta.2/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 13056 Jul 7 03:32 /home/rsnapshot/beta.3/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 13056 Jul 6 03:32 /home/rsnapshot/beta.4/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 13056 Jul 5 03:32 /home/rsnapshot/beta.5/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 13056 Jul 4 03:32 /home/rsnapshot/beta.6/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 10259 Dec 17 2023 /home/rsnapshot/_delete.14764/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 10267 Jan 21 11:32 /home/rsnapshot/_delete.15309/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 1976 May 31 09:32 /home/rsnapshot/delta.0/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 10312 Apr 26 03:33 /home/rsnapshot/delta.1/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 10312 Mar 29 03:32 /home/rsnapshot/delta.2/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 10463 Jun 28 03:11 /home/rsnapshot/gamma.0/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 10463 Jun 21 03:11 /home/rsnapshot/gamma.1/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 10463 Jun 14 03:32 /home/rsnapshot/gamma.2/openqa.suse.de/etc/openqa/openqa.ini
-rw-r--r-- 1 martchus root 10463 Jun 7 03:32 /home/rsnapshot/gamma.3/openqa.suse.de/etc/openqa/openqa.ini
```
Judging from the size, the snapshot from 2024-06-28 (gamma.0) seems to be the last good one. I copied that config back to OSD with
```
ssh backup-vm.qe.nue2.suse.org "cat /home/rsnapshot/gamma.0/openqa.suse.de/etc/openqa/openqa.ini" | ssh osd "cat - | sudo tee /etc/openqa/openqa.ini"
```
and restarted the openqa-webui service.
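
The size-based judgment above could be cross-checked by content. This is only a minimal sketch, assuming GNU coreutils (`stat -c`, `sha256sum`) on the backup host; `list_copies` is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# Print "<mtime-epoch> <size> <sha256> <path>" for each given file,
# oldest first, so the point where the content changed stands out.
# Assumption: GNU coreutils (stat -c, sha256sum) are available.
list_copies() {
    for f in "$@"; do
        # Skip anything that is not a regular file
        [ -f "$f" ] || continue
        sum=$(sha256sum "$f" | cut -d' ' -f1)
        printf '%s %s %s\n' "$(stat -c '%Y %s' "$f")" "$sum" "$f"
    done | sort -n
}

# Possible usage on the backup host (same glob as in the listing above):
# list_copies /home/rsnapshot/*/openqa.suse.de/etc/openqa/openqa.ini
```

Consecutive lines with the same checksum mean the file did not change between those snapshots; the first line with a new checksum marks the snapshot in which the content changed.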
## Suggestions
* Enable filesystem checksums (can be enabled for ext4) and check dmesg output in case of corruption
* Ask around if there might be other options (especially if enabling checksums has e.g. a big performance hit or version requirements we can't cope with)
* Read: https://ext4.wiki.kernel.org/index.php/Ext4_Metadata_Checksums
* Check for any problematic configurations in our salt states
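
As a starting point for the checksum suggestion, here is a hedged sketch for checking whether an ext4 filesystem already has `metadata_csum` enabled. It assumes the affected filesystem is ext4 and that `tune2fs` from e2fsprogs is installed; `has_metadata_csum` and `check_device` are hypothetical helper names, and the device path is a placeholder. Note that ext4 metadata checksums cover filesystem metadata, not file contents, so they may not catch every kind of file corruption.

```shell
#!/bin/sh
# Return success if a space-separated ext4 feature list (the value of the
# "Filesystem features:" line printed by tune2fs -l) contains metadata_csum.
has_metadata_csum() {
    case " $1 " in
        *" metadata_csum "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: report the checksum state of one device.
# The device path is passed by the caller; none is assumed here.
check_device() {
    features=$(sudo tune2fs -l "$1" | sed -n 's/^Filesystem features: *//p')
    if has_metadata_csum "$features"; then
        echo "metadata_csum enabled on $1"
    else
        echo "metadata_csum NOT enabled on $1"
    fi
}

# Enabling the feature, as described on the ext4 wiki page linked above,
# requires the filesystem to be unmounted first:
#   umount /dev/path
#   e2fsck -Df /dev/path
#   tune2fs -O metadata_csum /dev/path
#
# Looking for ext4 errors reported by the kernel:
#   dmesg | grep -i 'EXT4-fs error'
```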
## Out of scope
* Write a filesystem driver