action #73501
closed
Bind mounts of fixed assets is racy
0%
Description
Whoever, whatever - but osd rebooted at sunday 3am and afterwards all fixed assets were missing. The reason was that the bind mount for /assets/hdd/fixed happened before /assets was actually mounted, leaving an empty fixed folder afterwards. I fixed this manually for now, but to fix this for future reboots, I found https://unix.stackexchange.com/a/324723, which points to the proper fix in fstab.
Updated by okurz almost 4 years ago
- Status changed from New to In Progress
- Assignee set to okurz
- Target version set to Ready
Updated by okurz almost 4 years ago
- Status changed from In Progress to Feedback
coolo wrote:
Whoever, whatever - but osd rebooted at sunday 3am
Yes, that is done automatically by "rebootmgr" whenever there are package upgrades that ask for a reboot, e.g. a kernel upgrade. It also helps us to find these problems early rather than waiting for the poor person who reboots the server only once every 3 years ;)
Created https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/382 with a fixed /etc/fstab
To my understanding we only need explicit dependencies for the directories we want to bind mount into, because any mount point specified below an existing one already depends on all parent directory levels.
Also testing on lord.arch
EDIT: Test on lord.arch successful with:
UUID=5140113d-99e0-4244-beac-8787e56c0946 /var/lib/openqa btrfs subvol=@/var/lib/openqa 0 0
/abuild/pool /var/lib/openqa/pool none x-systemd.requires=/var/lib/openqa,x-systemd.automount,bind 0 0
I am wondering though if the additional dependencies are really necessary because currently on osd (without the change in /etc/fstab):
$ systemctl list-dependencies var-lib-openqa-share-factory-hdd-fixed.mount
var-lib-openqa-share-factory-hdd-fixed.mount
● ├─-.mount
● ├─space\x2dslow.mount
● ├─system.slice
● ├─var-lib-openqa-share.mount
● └─var-lib-openqa.mount
So both the source and target parent mount points are there. What should be missing?
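By default `systemctl list-dependencies` shows the requirement tree, so the ordering can be looked at separately. A minimal sketch, assuming a systemd host where `systemd-escape` and `systemctl` are available; the helper names `mount_unit_for` and `show_mount_ordering` are made up for illustration and are not part of the salt states:

```shell
#!/bin/sh
# Hypothetical helpers for illustration, not taken from the ticket.

# Translate a mount point path into the name of its systemd mount unit.
mount_unit_for() {
    systemd-escape --path --suffix=mount "$1"
}

# List the units that the mount unit for the given path is ordered after,
# as opposed to the requirement tree that list-dependencies shows by default.
show_mount_ordering() {
    systemctl list-dependencies --after "$(mount_unit_for "$1")"
}
```

For the mount discussed here this could be called as `show_mount_ordering /var/lib/openqa/share/factory/hdd/fixed`.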
Updated by okurz almost 4 years ago
- Related to action #64941: after every reboot openqaworker7 is missing var-lib-openqa-share.mount , check dependencies of service with openqaworker1 added
Updated by coolo almost 4 years ago
Is this after a reboot? I changed the fstab to have the mount point on /var/lib/openqa/share/factory instead of /assets, then unmounted and mounted manually to work around the issue.
Updated by okurz almost 4 years ago
The output of systemctl list-dependencies var-lib-openqa-share-factory-hdd-fixed.mount was from before any reboot; I did not trigger any.
Updated by okurz almost 4 years ago
- Due date set to 2020-11-08
I assume this should be ok but we should await some osd reboots, either manually induced or automatic.
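To make waiting for reboots less manual, the failure mode from the description (a mount point that exists but is empty) could be checked after each boot. A rough sketch, assuming `findmnt` from util-linux is installed; the function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical post-reboot check for illustration, not taken from the ticket.

# Succeeds if the directory exists and contains at least one entry.
dir_has_entries() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Succeeds only if the path is a mount point and is not empty.
check_bind_mount() {
    if ! findmnt --mountpoint "$1" >/dev/null 2>&1; then
        echo "not mounted: $1"
        return 1
    fi
    if ! dir_has_entries "$1"; then
        echo "mounted but empty: $1"
        return 2
    fi
    echo "ok: $1"
}
```

Example call: `check_bind_mount /var/lib/openqa/share/factory/hdd/fixed`. With x-systemd.automount the path may show up as an autofs mount point before first access, so the directory listing is what confirms that the assets are really there.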
Updated by okurz almost 4 years ago
- Status changed from Feedback to Resolved
Apparently that was not enough, see https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/399
Merged and deployed manually as gitlab CI runners are currently not picked up by default (EngInfra was informed). After git pull --rebase as root@osd in /srv/salt I called salt --no-color -l error -C 'G@roles:webui' state.apply and triggered a reboot to check, as currently only 5 jobs are running.
Verified working fine and dependencies look ok, e.g. when calling systemctl cat var-lib-openqa-share-factory-hdd-fixed.automount and others.