action #73501

Bind mounts of fixed assets are racy

Added by coolo about 1 year ago. Updated 12 months ago.

Status: Resolved
Priority: High
Assignee:
Target version:
Start date: 2020-10-19
Due date: 2020-11-08
% Done: 0%
Estimated time:
Description

Whoever, whatever - but osd rebooted on Sunday at 3am, and afterwards all fixed assets were missing. The reason was that the bind mount for /assets/hdd/fixed happened before /assets was actually mounted, leaving an empty fixed folder. I fixed this manually for now, but to fix it for future reboots, I found https://unix.stackexchange.com/a/324723, which points to the proper fix in fstab.


Related issues

Related to openQA Infrastructure - action #64941: after every reboot openqaworker7 is missing var-lib-openqa-share.mount , check dependencies of service with openqaworker1 (Resolved, 2020-03-27, 2021-06-11)

History

#1 Updated by okurz about 1 year ago

  • Status changed from New to In Progress
  • Assignee set to okurz
  • Target version set to Ready

#2 Updated by okurz about 1 year ago

  • Status changed from In Progress to Feedback

coolo wrote:

Whoever, whatever - but osd rebooted at sunday 3am

Yes, that is done automatically by "rebootmgr" whenever there are package upgrades that ask for a reboot, e.g. a kernel upgrade. It also helps us to find these problems rather than waiting for the poor person who reboots the server only once every 3 years ;)

Created https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/382 with a fixed /etc/fstab.

To my understanding, we only need dependencies for the directories onto which we want to bind mount, because any mount point specified below an existing one already depends on all parent directory levels.

Also testing on lord.arch

EDIT: Test on lord.arch successful with:

UUID=5140113d-99e0-4244-beac-8787e56c0946 /var/lib/openqa btrfs subvol=@/var/lib/openqa 0 0
/abuild/pool /var/lib/openqa/pool none x-systemd.requires=/var/lib/openqa,x-systemd.automount,bind 0 0

I am wondering, though, whether the additional dependencies are really necessary, because currently on osd (without the change in /etc/fstab):

$ systemctl list-dependencies var-lib-openqa-share-factory-hdd-fixed.mount
var-lib-openqa-share-factory-hdd-fixed.mount
● ├─-.mount
● ├─space\x2dslow.mount
● ├─system.slice
● ├─var-lib-openqa-share.mount
● └─var-lib-openqa.mount

So both the source and target parent mount points are there. What should be missing?
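
The unit names in the listing above are derived from the mount paths: "/" becomes "-", and a literal "-" in a path is escaped as "\x2d" (hence space\x2dslow.mount, presumably for /space-slow). The real tool for this is systemd-escape --path --suffix=mount. The following is only a simplified sketch of that mapping, and the function name path_to_mount_unit is made up for illustration. It handles just "/" and "-" and ignores the other characters that systemd escapes.

```shell
# Hypothetical helper: map an absolute mount path to its systemd mount unit name.
# Simplified sketch only; "systemd-escape --path --suffix=mount" is the real tool
# and also handles duplicate slashes inside the path, leading dots, etc.
path_to_mount_unit() {
    # Strip leading and trailing slashes.
    p=$(printf '%s\n' "$1" | sed -e 's,^/*,,' -e 's,/*$,,')
    if [ -z "$p" ]; then
        # The root file system is named "-.mount".
        printf '%s\n' '-.mount'
        return 0
    fi
    # Escape literal dashes first, then turn path separators into dashes.
    printf '%s.mount\n' "$(printf '%s\n' "$p" | sed -e 's/-/\\x2d/g' -e 's,/,-,g')"
}

path_to_mount_unit /var/lib/openqa/share/factory/hdd/fixed
path_to_mount_unit /space-slow
```

The resulting name can then be passed to systemctl list-dependencies to see which parent mounts a given bind mount already depends on.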

#3 Updated by okurz about 1 year ago

  • Related to action #64941: after every reboot openqaworker7 is missing var-lib-openqa-share.mount , check dependencies of service with openqaworker1 added

#4 Updated by coolo about 1 year ago

Is this after a reboot? I changed the fstab to have the mount point on /var/lib/openqa/share/factory instead of /assets, then unmounted and mounted manually to work around the issue.

#5 Updated by okurz about 1 year ago

The output of systemctl list-dependencies var-lib-openqa-share-factory-hdd-fixed.mount was taken before any reboot; I did not trigger one.

#6 Updated by okurz about 1 year ago

  • Due date set to 2020-11-08

I assume this should be ok, but we should wait for some osd reboots, either manually induced or automatic.

#7 Updated by okurz 12 months ago

  • Status changed from Feedback to Resolved

Apparently that was not enough, see https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/399

Merged and deployed manually, as GitLab CI runners are currently not picked up by default (EngInfra was informed). After git pull --rebase as root@osd in /srv/salt, I called salt --no-color -l error -C 'G@roles:webui' state.apply and triggered a reboot to check, as currently only 5 jobs are running.

Verified that it works fine and that the dependencies look ok, e.g. when calling systemctl cat var-lib-openqa-share-factory-hdd-fixed.automount and others.
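
Since the original symptom was an empty fixed folder after boot, a small post-reboot check could complement the dependency inspection. This is only a sketch and not part of the merged change. The function names dir_is_populated and check_bind_target are made up for illustration, and the path in the usage comment is inferred from the unit name above.

```shell
# Hypothetical helper: succeed if the directory exists and has at least one entry.
dir_is_populated() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Hypothetical helper: report whether a bind-mount target looks usable after boot.
check_bind_target() {
    if dir_is_populated "$1"; then
        echo "ok: $1 has content"
    else
        echo "warning: $1 is missing or empty" >&2
        return 1
    fi
}

# Example call (path assumed from the unit name in this ticket):
#   check_bind_target /var/lib/openqa/share/factory/hdd/fixed
```

A non-zero exit status would indicate that the bind mount probably raced with its parent mount again.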
