action #94949
closed
Failed systemd services alert for openqaworker3 var-lib-openqa-share.automount
Added by livdywan almost 4 years ago. Updated almost 4 years ago.
0%
Description
2021-06-30 14:48:00 openqaworker3 var-lib-openqa-share.automount 1
Logging into the machine shows this:
> systemctl --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● var-lib-openqa-share.automount loaded failed failed var-lib-openqa-share.automount
> systemctl status var-lib-openqa-share.automount
● var-lib-openqa-share.automount
Loaded: loaded (/etc/fstab; generated; vendor preset: disabled)
Active: failed (Result: resources) since Tue 2021-06-29 14:12:38 CEST; 24h ago
Where: /var/lib/openqa/share
Docs: man:fstab(5)
man:systemd-fstab-generator(8)
Might be related to #94919 in some way? Unfortunately I can't deduce what caused it from the status.
Updated by okurz almost 4 years ago
- Related to action #94919: All arm workers down 2021-06-30 , NUE SRV2 Rack A8 was switched off by EngInfra size:S added
Updated by okurz almost 4 years ago
- Status changed from New to Blocked
- Assignee set to okurz
- Target version set to Ready
Well, the mount point is related to NFS which is related to network so I would say it can be related for sure :)
Updated by mkittler almost 4 years ago
Ah, now you've assigned yourself to it. I was also looking at the problem. It seems that the normal mount and the automount are interfering with each other:
martchus@openqaworker3:~> sudo systemctl status var-lib-openqa-share.automount
● var-lib-openqa-share.automount
Loaded: loaded (/etc/fstab; generated; vendor preset: disabled)
Active: failed (Result: resources) since Tue 2021-06-29 14:12:38 CEST; 1 day 1h ago
Where: /var/lib/openqa/share
Docs: man:fstab(5)
man:systemd-fstab-generator(8)
Jun 28 04:04:34 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 32652 (/usr/bin/isotov)
Jun 28 04:04:35 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 32581 (/usr/bin/isotov)
Jun 28 04:04:36 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 32581 (/usr/bin/isotov)
Jun 28 12:04:30 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 31589 (/usr/bin/isotov)
Jun 29 05:04:47 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 3055 (/usr/bin/isotov)
Jun 29 05:04:47 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 3118 (/usr/bin/isotov)
Jun 29 14:12:38 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Got invalid poll event 16 on pipe (fd=152)
Jun 29 14:12:38 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Unit entered failed state.
Jun 30 14:59:15 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Path /var/lib/openqa/share is already a mount point, refusing start.
Jun 30 14:59:15 openqaworker3 systemd[1]: Failed to set up automount var-lib-openqa-share.automount.
martchus@openqaworker3:~> sudo systemctl status var-lib-openqa-share.mount
● var-lib-openqa-share.mount - /var/lib/openqa/share
Loaded: loaded (/etc/fstab; generated; vendor preset: disabled)
Active: active (mounted) since Wed 2021-06-30 15:31:23 CEST; 6min ago
Where: /var/lib/openqa/share
What: openqa.suse.de:/var/lib/openqa/share
Docs: man:fstab(5)
man:systemd-fstab-generator(8)
Process: 5015 ExecMount=/usr/bin/mount openqa.suse.de:/var/lib/openqa/share /var/lib/openqa/share -t nfs -o retry=30,ro,x-systemd.automount,x-systemd.mount-timeout=30m (code=exited, status=0/SUCCESS)
Maybe the problem has been introduced by baf1e6dd1f5efb7ce9d4064d9ef841a18fa56064.
Is it really blocked by #94919? This is on openqaworker3 (and not ARM workers) and it doesn't seem to be only a networking issue considering that the normal mount unit works and the NFS mount can be accessed.
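To check whether the automount and the plain mount are both present at the path, one can look at what is stacked on the mount point. A minimal sketch, assuming the usual /proc/self/mountinfo layout; the function name mount_types_at is hypothetical and not part of any tool used in this ticket:

```shell
#!/bin/sh
# Hypothetical helper: print the filesystem type of every mount
# stacked on a given mount point, in the order the kernel lists them.
# Usage: mount_types_at PATH [MOUNTINFO_FILE]
mount_types_at() {
    path=$1
    mountinfo=${2:-/proc/self/mountinfo}
    awk -v p="$path" '
        # Field 5 of mountinfo is the mount point.
        $5 == p {
            # The fstype is the first field after the lone "-" separator,
            # which follows a variable number of optional fields.
            for (i = 7; i <= NF; i++) {
                if ($i == "-") { print $(i + 1); break }
            }
        }
    ' "$mountinfo"
}
```

It would be called as `mount_types_at /var/lib/openqa/share`. With a working systemd automount one would typically expect an `autofs` entry followed by the NFS type once the share has been accessed. Seeing only the NFS type would be consistent with the "already a mount point" message above, though that is an assumption and was not verified on openqaworker3.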
Updated by okurz almost 4 years ago
- Status changed from Blocked to New
- Assignee changed from okurz to mkittler
@mkittler In that case you can take over, of course. I would merely have waited for the network problems to be resolved before checking again.
Updated by mkittler almost 4 years ago
The problem is not reproducible on other workers, e.g.:
martchus@openqaworker2:~> sudo systemctl status var-lib-openqa-share.mount
● var-lib-openqa-share.mount - /var/lib/openqa/share
Loaded: loaded (/etc/fstab; generated; vendor preset: disabled)
Active: active (mounted) since Thu 2021-07-01 09:43:55 CEST; 58min ago
Where: /var/lib/openqa/share
What: openqa.suse.de:/var/lib/openqa/share
Docs: man:fstab(5)
man:systemd-fstab-generator(8)
Process: 23969 ExecMount=/usr/bin/mount openqa.suse.de:/var/lib/openqa/share /var/lib/openqa/share -t nfs -o retry=30,ro,x-systemd.automount,x-systemd.mount-timeout=30m (code=exited, status=0/SUCCESS)
Tasks: 0
CGroup: /system.slice/var-lib-openqa-share.mount
martchus@openqaworker2:~>
martchus@openqaworker2:~> sudo systemctl status var-lib-openqa-share.automount
● var-lib-openqa-share.automount
Loaded: loaded (/etc/fstab; generated; vendor preset: disabled)
Active: active (running) since Sun 2021-06-20 03:36:04 CEST; 1 weeks 4 days ago
Where: /var/lib/openqa/share
Docs: man:fstab(5)
man:systemd-fstab-generator(8)
Jun 24 05:51:42 openqaworker2 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 29547 (/usr/bin/isotov)
Jun 24 05:51:43 openqaworker2 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 29547 (/usr/bin/isotov)
Jun 24 09:51:40 openqaworker2 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 20437 (/usr/bin/isotov)
Jun 24 09:51:40 openqaworker2 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 20437 (/usr/bin/isotov)
Jun 24 11:51:33 openqaworker2 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 19864 (/usr/bin/isotov)
Jun 24 12:51:34 openqaworker2 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 3388 (/usr/bin/isotov)
Jun 24 16:27:02 openqaworker2 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 21540 (/usr/bin/isotov)
Jun 25 03:41:15 openqaworker2 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 27631 (/usr/bin/isotov)
Jun 25 16:41:20 openqaworker2 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 3927 (/usr/bin/isotov)
Jun 29 14:11:00 openqaworker2 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 23038 (worker)
martchus@openqaworker2:~> sudo systemctl restart var-lib-openqa-share.automount
martchus@openqaworker2:~> sudo systemctl status var-lib-openqa-share.automount
● var-lib-openqa-share.automount
Loaded: loaded (/etc/fstab; generated; vendor preset: disabled)
Active: active (waiting) since Thu 2021-07-01 12:01:17 CEST; 4s ago
Where: /var/lib/openqa/share
Docs: man:fstab(5)
man:systemd-fstab-generator(8)
Jul 01 12:01:17 openqaworker2 systemd[1]: Unset automount var-lib-openqa-share.automount.
Jul 01 12:01:17 openqaworker2 systemd[1]: Stopping var-lib-openqa-share.automount.
Jul 01 12:01:17 openqaworker2 systemd[1]: Set up automount var-lib-openqa-share.automount.
martchus@openqaworker2:~> sudo systemctl stop var-lib-openqa-share.automount
martchus@openqaworker2:~> sudo systemctl start var-lib-openqa-share.automount
martchus@openqaworker2:~> sudo systemctl status var-lib-openqa-share.automount
● var-lib-openqa-share.automount
Loaded: loaded (/etc/fstab; generated; vendor preset: disabled)
Active: active (waiting) since Thu 2021-07-01 12:01:36 CEST; 2s ago
Where: /var/lib/openqa/share
Docs: man:fstab(5)
man:systemd-fstab-generator(8)
Jul 01 12:01:36 openqaworker2 systemd[1]: Set up automount var-lib-openqa-share.automount.
martchus@openqaworker2:~> sudo systemctl status var-lib-openqa-share.automount
● var-lib-openqa-share.automount
Loaded: loaded (/etc/fstab; generated; vendor preset: disabled)
Active: active (waiting) since Thu 2021-07-01 12:01:36 CEST; 30s ago
Where: /var/lib/openqa/share
Docs: man:fstab(5)
man:systemd-fstab-generator(8)
Jul 01 12:01:36 openqaworker2 systemd[1]: Set up automount var-lib-openqa-share.automount.
martchus@openqaworker2:~> ls -l /var/lib/openqa/share/factory
total 3608
drwxr-xr-x 3 rbrown root 393216 1. Jul 11:54 hdd
drwxr-xr-x 3 rbrown root 114688 1. Jul 02:19 iso
drwxr-xr-x 2 rbrown root 44171264 1. Jul 12:00 other
drwxr-xr-x 2666 rbrown root 380928 1. Jul 12:00 repo
drwxrwxrwt 4 root root 4096 1. Jul 12:02 tmp
martchus@openqaworker2:~> sudo systemctl status var-lib-openqa-share.automount
● var-lib-openqa-share.automount
Loaded: loaded (/etc/fstab; generated; vendor preset: disabled)
Active: active (running) since Thu 2021-07-01 12:01:36 CEST; 58s ago
Where: /var/lib/openqa/share
Docs: man:fstab(5)
man:systemd-fstab-generator(8)
Jul 01 12:01:36 openqaworker2 systemd[1]: Set up automount var-lib-openqa-share.automount.
Jul 01 12:02:29 openqaworker2 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 16351 (bash)
But on openqaworker3 we get:
martchus@openqaworker3:~> sudo systemctl stop var-lib-openqa-share.automount
martchus@openqaworker3:~> sudo systemctl start var-lib-openqa-share.automount
Job for var-lib-openqa-share.automount failed.
See "systemctl status var-lib-openqa-share.automount" and "journalctl -xe" for details.
martchus@openqaworker3:~> sudo systemctl status var-lib-openqa-share.automount
● var-lib-openqa-share.automount
Loaded: loaded (/etc/fstab; generated; vendor preset: disabled)
Active: inactive (dead) since Tue 2021-06-29 14:12:38 CEST; 1 day 21h ago
Where: /var/lib/openqa/share
Docs: man:fstab(5)
man:systemd-fstab-generator(8)
Jun 29 05:04:47 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 3055 (/usr/bin/isotov)
Jun 29 05:04:47 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Got automount request for /var/lib/openqa/share, triggered by 3118 (/usr/bin/isotov)
Jun 29 14:12:38 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Got invalid poll event 16 on pipe (fd=152)
Jun 29 14:12:38 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Unit entered failed state.
Jun 30 14:59:15 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Path /var/lib/openqa/share is already a mount point, refusing start.
Jun 30 14:59:15 openqaworker3 systemd[1]: Failed to set up automount var-lib-openqa-share.automount.
Jul 01 11:11:13 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Path /var/lib/openqa/share is already a mount point, refusing start.
Jul 01 11:11:13 openqaworker3 systemd[1]: Failed to set up automount var-lib-openqa-share.automount.
Jul 01 12:04:26 openqaworker3 systemd[1]: var-lib-openqa-share.automount: Path /var/lib/openqa/share is already a mount point, refusing start.
Jul 01 12:04:26 openqaworker3 systemd[1]: Failed to set up automount var-lib-openqa-share.automount.
At least the unit no longer enters the failed state (after I reset it via sudo systemctl reset-failed); it now ends up inactive (dead) instead.
The /etc/fstab entry and the generated systemd units are identical on both hosts.
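A comparison like this could be scripted roughly as follows. This is only a sketch: the helper names fstab_options_for and has_automount_option are made up for illustration, and the option list it looks for is the one visible in the ExecMount lines above.

```shell
#!/bin/sh
# Hypothetical helper: print the options column of the fstab entry
# for a mount point. Usage: fstab_options_for MOUNTPOINT [FSTAB_FILE]
fstab_options_for() {
    # Skip comment lines; field 2 is the mount point, field 4 the options.
    awk -v p="$1" '$1 !~ /^#/ && $2 == p { print $4 }' "${2:-/etc/fstab}"
}

# Hypothetical helper: succeed if the entry carries x-systemd.automount,
# the option that makes systemd-fstab-generator emit an .automount unit.
has_automount_option() {
    fstab_options_for "$1" "${2:-/etc/fstab}" \
        | tr ',' '\n' \
        | grep -qxF 'x-systemd.automount'
}
```

Running `fstab_options_for /var/lib/openqa/share` on each worker and diffing the results would be one way to confirm that the entries really match.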
Updated by okurz almost 4 years ago
- Related to action #93964: salt-states CI pipeline deploy step fails on some workers with "Unable to unmount /var/lib/openqa/share: umount.nfs: /var/lib/openqa/share: device is busy." added
Updated by okurz almost 4 years ago
- Status changed from New to Resolved
- Assignee changed from mkittler to okurz
#94919 is resolved. I just checked https://monitor.qa.suse.de/d/KToPYLEWz/failed-systemd-services?orgId=1 and it's all green. Since the network problems have been fixed and mounting works again, I accept the hypothesis that they explain the problems we observed.
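The same check can also be done directly on a worker instead of via the dashboard. A minimal sketch, assuming the input comes from `systemctl list-units --state=failed --plain --no-legend` (so there is no leading bullet column); the function name failed_units_of_type is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read plain, legend-free `systemctl list-units`
# output on stdin and print the names of units of the given type.
# Usage: ... | failed_units_of_type automount
failed_units_of_type() {
    awk -v suffix=".$1" '
        {
            # Column 1 is the unit name; keep it if it ends with the suffix.
            n = length($1) - length(suffix)
            if (n >= 0 && substr($1, n + 1) == suffix) print $1
        }
    '
}
```

For example: `systemctl list-units --state=failed --plain --no-legend | failed_units_of_type automount`. Empty output would mean that no automount unit is currently in the failed state on that host.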
Updated by okurz 10 months ago
- Related to action #163097: Share mount not working on openqaworker-arm-1 and other workers size:M added