Bind mounts of fixed assets are racy
Whoever or whatever triggered it, osd rebooted on Sunday at 3am and came up with all fixed assets missing. The reason was that the bind mount for /assets/hdd/fixed happened before /assets was actually mounted, leaving an empty fixed folder afterwards. I fixed this manually for now, but to fix it for future reboots, I found that https://unix.stackexchange.com/a/324723 leads to the proper fix in fstab.
#1 Updated by okurz over 2 years ago
- Status changed from New to In Progress
- Assignee set to okurz
- Target version set to Ready
#2 Updated by okurz over 2 years ago
- Status changed from In Progress to Feedback
"Whoever, whatever - but osd rebooted at sunday 3am"
Yes, that is done automatically by "rebootmgr" whenever there are package upgrades that ask for a reboot, e.g. a kernel upgrade. It also helps us find these problems early instead of waiting for the poor person who reboots the server only once every 3 years ;)
with a fixed /etc/fstab
As I understand it, we only need explicit dependencies for the directories we want to bind mount into, because any mount point specified below an existing one already depends on all parent directory levels.
Also testing on lord.arch
EDIT: Test on lord.arch successful with:
UUID=5140113d-99e0-4244-beac-8787e56c0946 /var/lib/openqa btrfs subvol=@/var/lib/openqa 0 0
/abuild/pool /var/lib/openqa/pool none x-systemd.requires=/var/lib/openqa,x-systemd.automount,bind 0 0
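To find other entries that might be affected by the same race, a rough check could list every bind mount in fstab that has no explicit x-systemd.requires option. This is only a sketch: the function name bind_mounts_without_requires is made up for illustration, and the match on "x-systemd.requires" also accepts x-systemd.requires-mounts-for.

```shell
# Hypothetical helper, not part of any existing tooling:
# print the mount point of every bind-mount entry in an fstab-style file
# that carries no x-systemd.requires* option.
bind_mounts_without_requires() {
    awk '
        # skip comment lines and lines without an options field
        /^[ \t]*#/ || NF < 4 { next }
        # field 4 holds the mount options, field 2 the mount point
        $4 ~ /(^|,)r?bind(,|$)/ && $4 !~ /x-systemd\.requires/ { print $2 }
    ' "${1:-/etc/fstab}"
}
```

Called without arguments it reads /etc/fstab. For the two lord.arch entries above it would print nothing, because the only bind mount there already has x-systemd.requires set. An entry without that option is not necessarily broken, it just relies on the dependencies systemd derives on its own.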
I am wondering though if the additional dependencies are really necessary because currently on osd (without the change in /etc/fstab):
$ systemctl list-dependencies var-lib-openqa-share-factory-hdd-fixed.mount
var-lib-openqa-share-factory-hdd-fixed.mount
● ├─-.mount
● ├─space\x2dslow.mount
● ├─system.slice
● ├─var-lib-openqa-share.mount
● └─var-lib-openqa.mount
So both the source and target parent mount points are there. What could be missing?
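The unit names in that output are the escaped mount paths, which is why /space-slow shows up as space\x2dslow.mount. A small sketch for looking up the unit of a given mount point and printing its requirement and ordering properties side by side could look like this. The function names mount_unit_for and show_mount_deps are made up for illustration; systemd-escape and systemctl show are the standard systemd tools.

```shell
# Hypothetical helpers for inspecting mount unit dependencies.

# Translate a mount point path into the name of its systemd mount unit.
mount_unit_for() {
    systemd-escape --path --suffix=mount "$1"
}

# Print the Requires= and After= properties of the mount unit for a path,
# so that requirement and ordering can be compared directly.
show_mount_deps() {
    systemctl show -p Requires -p After "$(mount_unit_for "$1")"
}
```

For example, show_mount_deps /var/lib/openqa/share/factory/hdd/fixed would query the unit from the listing above.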
#3 Updated by okurz over 2 years ago
- Related to action #64941: after every reboot openqaworker7 is missing var-lib-openqa-share.mount , check dependencies of service with openqaworker1 added
#4 Updated by coolo over 2 years ago
Is this after a reboot? I changed the fstab to have the mount point on /var/lib/openqa/share/factory instead of /assets, then unmounted and mounted manually to work around the issue.
#5 Updated by okurz over 2 years ago
The output of systemctl list-dependencies var-lib-openqa-share-factory-hdd-fixed.mount was from before any reboot, I did not trigger any.
#6 Updated by okurz over 2 years ago
- Due date set to 2020-11-08
I assume this should be ok but we should await some osd reboots, either manually induced or automatic.
#7 Updated by okurz over 2 years ago
- Status changed from Feedback to Resolved
Apparently that was not enough, see https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/399
Merged and deployed manually as gitlab CI runners are currently not picked up by default (EngInfra was informed). After git pull --rebase as root@osd in /srv/salt I called salt --no-color -l error -C 'G@roles:webui' state.apply and triggered a reboot to check, as currently only 5 jobs are running.
Verified that it works fine and the dependencies look ok, e.g. when calling systemctl cat var-lib-openqa-share-factory-hdd-fixed.automount and others.