salt-states CI pipeline deploy step fails on some workers with "Unable to unmount /var/lib/openqa/share: umount.nfs: /var/lib/openqa/share: device is busy."
openqaworker-arm-3.suse.de:
----------
          ID: /var/lib/openqa/share
    Function: mount.mounted
      Result: False
     Comment: Unable to unmount /var/lib/openqa/share: umount.nfs: /var/lib/openqa/share: device is busy.
     Started: 17:15:35.643876
    Duration: 271.875 ms
     Changes:
              ----------
              umount:
                  Forced unmount and mount because options (noauto) changed
Maybe we need to manually ensure all workers have the updated options and then reboot them? It is not even clear what keeps /var/lib/openqa/share busy, as normally we should only use caching over rsync and HTTP from osd, not NFS.
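To find out what keeps the share busy, one could check which processes hold open file descriptors below the mount point. A minimal sketch (a rough stand-in for `fuser -m`, assuming a Linux host with `/proc`; the mount point path is the one from the ticket):

```python
import os

def processes_using(path):
    """Return PIDs that hold an open file descriptor under `path`.

    Walks /proc/<pid>/fd and checks whether any symlink target lies
    below the given directory. Only works on Linux and only sees
    processes we are allowed to inspect.
    """
    path = os.path.realpath(path)
    pids = set()
    for pid in filter(str.isdigit, os.listdir("/proc")):
        fd_dir = f"/proc/{pid}/fd"
        try:
            fds = os.listdir(fd_dir)
        except OSError:  # process exited or not ours to inspect
            continue
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if target == path or target.startswith(path + os.sep):
                pids.add(int(pid))
    return pids

# e.g. processes_using("/var/lib/openqa/share")
```

In practice `fuser -vm /var/lib/openqa/share` or `lsof` would give the same answer with less effort; the sketch just shows where the information comes from.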
- AC1: Stable deployments also after reboot of multiple worker machines
- Status changed from New to In Progress
- Assignee set to okurz
Just observed that now again in https://gitlab.suse.de/openqa/salt-pillars-openqa/-/jobs/492075#L1872
- Due date set to 2021-07-27
- Status changed from In Progress to Feedback
Seems like salt detects changes all the time because it compares the content of /etc/fstab with the output of the command "mount". This is explained in https://github.com/saltstack/salt/issues/18630#issuecomment-342486325 for different mount parameters that trigger the same behaviour. The option "x-systemd.mount-timeout=30m" is correctly included in /etc/fstab, but the mount point entry shown by "mount" does not have it, as these special parameters are only read by systemd and never reach the kernel's mount table.
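The mismatch can be sketched like this (the option strings are illustrative, not copied from the actual worker's fstab):

```python
# Options recorded in /etc/fstab versus what `mount` / /proc/self/mounts
# reports back: x-systemd.* options are consumed by systemd and never
# appear in the kernel's mount table, so a naive comparison always differs.
fstab_opts = {"ro", "noauto", "x-systemd.mount-timeout=30m"}
mounted_opts = {"ro", "noauto"}  # as reported by `mount`

def effective(opts):
    """Drop systemd-only options, which the kernel never reports back."""
    return {o for o in opts if not o.startswith("x-systemd.")}

# The naive diff salt effectively performs sees a change on every run:
print(fstab_opts - mounted_opts)              # {'x-systemd.mount-timeout=30m'}
# Filtering out the systemd-only options removes the spurious difference:
print(effective(fstab_opts) == mounted_opts)  # True
```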
I found a better approach now: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/526
- Status changed from Feedback to Resolved
MR merged. Now checking other pipelines.
Found other failures fixed in https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/527