Bind mounts of fixed assets are racy
Whoever, whatever - but osd rebooted on Sunday at 3am, and afterwards all fixed assets were missing. The reason was that the bind mount for /assets/hdd/fixed happened before /assets was actually mounted, leaving an empty fixed folder afterwards. I fixed this manually for now, but for a fix that survives future reboots, https://unix.stackexchange.com/a/324723 seems to point to the proper fix in fstab.
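The symptom described above (an empty fixed folder after boot) could be caught early with a small post-reboot check. This is only a sketch: the helper name has_entries is made up for illustration and is not part of openQA or the salt states.

```shell
#!/bin/sh
# has_entries DIR (hypothetical helper)
# Succeeds if DIR exists and contains at least one entry, including dotfiles.
# An empty bind mount target right after boot hints at the ordering race.
has_entries() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}
```

For example, has_entries /assets/hdd/fixed || echo "fixed assets missing" >&2 could run from a monitoring check after each reboot.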
#2 Updated by okurz about 1 year ago
- Status changed from In Progress to Feedback
Whoever, whatever - but osd rebooted at sunday 3am
yes, that is done automatically by "rebootmgr" whenever there are package upgrades that ask for a reboot, e.g. kernel upgrade. It also helps us to find these problems and not wait for the poor person that reboots the server only once every 3 years ;)
with a fixed /etc/fstab
To my understanding we only need dependencies for the directories into which we want to bind mount, because any mount point specified below an existing one already depends on all parent directory levels.
Also testing on lord.arch
EDIT: Test on lord.arch successful with:
UUID=5140113d-99e0-4244-beac-8787e56c0946 /var/lib/openqa btrfs subvol=@/var/lib/openqa 0 0
/abuild/pool /var/lib/openqa/pool none x-systemd.requires=/var/lib/openqa,x-systemd.automount,bind 0 0
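To spot bind mount entries that declare no explicit dependency, something like the following sketch could be run against an fstab file. The function name list_unordered_bind_mounts is an assumption for illustration. The check is only a heuristic: it looks at the options field and knows nothing about the implicit dependencies systemd adds for nested mount points.

```shell
#!/bin/sh
# list_unordered_bind_mounts FSTAB (hypothetical helper)
# Prints the mount point of every bind mount entry in FSTAB whose options
# contain neither x-systemd.requires= nor x-systemd.requires-mounts-for=.
list_unordered_bind_mounts() {
    awk '
        /^[ \t]*#/ || NF < 4 { next }   # skip comments and incomplete lines
        {
            n = split($4, opts, ",")    # field 4 holds the mount options
            is_bind = 0
            has_dep = 0
            for (i = 1; i <= n; i++) {
                if (opts[i] == "bind" || opts[i] == "rbind") is_bind = 1
                if (opts[i] ~ /^x-systemd\.requires(-mounts-for)?=/) has_dep = 1
            }
            if (is_bind && !has_dep) print $2
        }
    ' "$1"
}
```

Calling list_unordered_bind_mounts /etc/fstab prints one mount point per line; no output means every bind mount entry carries an explicit dependency option.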
I am wondering though if the additional dependencies are really necessary because currently on osd (without the change in /etc/fstab):
$ systemctl list-dependencies var-lib-openqa-share-factory-hdd-fixed.mount
var-lib-openqa-share-factory-hdd-fixed.mount
● ├─-.mount
● ├─space\x2dslow.mount
● ├─system.slice
● ├─var-lib-openqa-share.mount
● └─var-lib-openqa.mount
So both the source and target parent mount points are there. What should be missing?
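For comparing the dependencies of several mount points, the unit name can be derived from the path instead of typing it by hand. A minimal sketch, assuming systemd-escape and systemctl are available on the host; the function names mount_unit_for and show_mount_ordering are made up for illustration.

```shell
#!/bin/sh
# mount_unit_for PATH (hypothetical helper)
# Prints the systemd mount unit name for a mount point path, e.g. a path
# containing "-" is escaped as \x2d, just as in the listing for space-slow.
mount_unit_for() {
    systemd-escape --path --suffix=mount "$1"
}

# show_mount_ordering PATH (hypothetical helper)
# Prints the Requires= and After= properties of the mount unit for PATH,
# i.e. what the unit needs and what it is ordered after at boot.
show_mount_ordering() {
    systemctl show --property=Requires --property=After "$(mount_unit_for "$1")"
}
```

For instance, show_mount_ordering /var/lib/openqa/share/factory/hdd/fixed shows whether the source mount appears in After= and not only in the dependency tree, which matters because a requirement alone does not imply ordering.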
- Status changed from Feedback to Resolved
Apparently that was not enough, see https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/399
Merged and deployed manually as gitlab CI runners are currently not picked up by default (EngInfra was informed). After git pull --rebase as root@osd in /srv/salt I called salt --no-color -l error -C 'G@roles:webui' state.apply and then triggered a reboot to check, as currently only 5 jobs are running.
Verified that it works fine and the dependencies look ok, e.g. when calling systemctl cat var-lib-openqa-share-factory-hdd-fixed.automount and others.