action #109046
closed
[tools] auto_review:"Unable to find image SLES15-SP3-JeOS.x86_64-15.3-kvm-and-xen-GM.qcow2.*svirt":retry
Added by jlausuch over 2 years ago.
Updated over 2 years ago.
Category: Bugs in existing tests
Seems like an NFS issue; normally a re-mount helps. However, this time it has not.
openqaw5-xen:~ # find /var/lib/openqa/share/factory/hdd /var/lib/openqa/share/factory/hdd/fixed -name SLES15-SP3-JeOS.x86_64-15.3-kvm-and-xen-GM.qcow2
find: ‘/var/lib/openqa/share/factory/hdd/fixed’: Stale file handle
find: ‘/var/lib/openqa/share/factory/hdd/fixed’: Stale file handle
mloviska wrote:
Seems like an NFS issue; normally a re-mount helps. However, this time it has not.
openqaw5-xen:~ # find /var/lib/openqa/share/factory/hdd /var/lib/openqa/share/factory/hdd/fixed -name SLES15-SP3-JeOS.x86_64-15.3-kvm-and-xen-GM.qcow2
find: ‘/var/lib/openqa/share/factory/hdd/fixed’: Stale file handle
find: ‘/var/lib/openqa/share/factory/hdd/fixed’: Stale file handle
Yes, I also have the same problem mounting the NFS on my environment. Must be that.
This has been happening since Sunday. I tried to investigate a bit, but I only found out that the fixed directory had group root and all permissions:
drwxrwxrwx 2 geekotest root 49152 Mar 24 03:05 fixed
So, I changed it to geekotest:nogroup with standard permissions (as the parent dir hdd):
drwxr-xr-x 2 geekotest nogroup 49152 Mar 24 03:05 fixed
However, that didn't solve the problem, so it must be something else.
- Subject changed from Unable to find image defined in HDD_1 in fixed directory for svirt backend to [qe-core] Unable to find image defined in HDD_1 in fixed directory for svirt backend
As you stated, the file does exist on osd and osd serves the directory over NFS so the problem is likely on the hypervisor host. I assume if you don't pick it up for qac then you expect qe-core to look at that.
okurz wrote:
As you stated, the file does exist on osd and osd serves the directory over NFS so the problem is likely on the hypervisor host. I assume if you don't pick it up for qac then you expect qe-core to look at that.
Ok, thanks. Maybe restarting NFS service would help?
It seems that after mounting NFS, the fixed dir is giving the issue Martin mentioned.
- Subject changed from [qe-core] Unable to find image defined in HDD_1 in fixed directory for svirt backend to [tools] Unable to find image defined in HDD_1 in fixed directory for svirt backend
- Assignee set to okurz
- Target version set to Ready
- Copied to action #109085: [qe-core] Ensure openqaw5-xen.qa.suse.de and potentially other hypervisor hosts OSs are updated to prevent NFS or other problems added
- Status changed from Workable to Resolved
okurz wrote:
According to my research, e.g. https://unix.stackexchange.com/questions/433051/mount-nfs-stale-file-handle-error-cannot-umount , this is a situation which can just happen. The best solution I have seen so far is to check periodically in a cron job on the client, but I think the situation does not appear often enough for that. https://unix.stackexchange.com/a/447581 suggests calling exportfs -ua && exportfs -a on the server. I did that right now but I wonder when this should be done automatically.
https://unix.stackexchange.com/a/433071 suggests that the problem might be due to an outdated NFS4 client. So maybe the best course of action would be to ensure that openqaw5-xen itself is updated to a more current OS -> Created a specific ticket about that to "[qe-core]" to handle that in #109085
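The periodic client-side check mentioned above could look roughly like the following. This is only a minimal sketch and not something that was deployed for this ticket: the function names are made up for illustration, the probed path is taken from the find output quoted earlier, and the remount step assumes the share has an /etc/fstab entry so that a plain mount of the mount point works.

```shell
#!/bin/sh
# Sketch only: detect a stale NFS handle on a client and remount the share.
# All function names here are hypothetical helpers.

# Return 0 if the error text on stdin mentions a stale handle.
is_stale_message() {
    grep -q 'Stale file handle'
}

# Probe a path; return 0 if the probe fails with a stale-handle error.
# LC_ALL=C keeps the message in English; timeout guards against a hung server.
dir_is_stale() {
    err=$(LC_ALL=C timeout 30 stat -- "$1" 2>&1 >/dev/null) || true
    printf '%s\n' "$err" | is_stale_message
}

# Lazily unmount and mount again if the probed directory looks stale.
# $1: mount point (assumed to be listed in /etc/fstab)
# $2: directory below the mount point to probe
remount_if_stale() {
    if dir_is_stale "$2"; then
        umount -l "$1" && mount "$1"
    fi
}
```

From a root cron job this could be called as remount_if_stale /var/lib/openqa/share /var/lib/openqa/share/factory/hdd/fixed, assuming /var/lib/openqa/share is the actual mount point on the worker.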
Ok, that makes sense. Thanks for taking care.
Btw, on my client side, I can now access the fixed dir.
- Subject changed from [tools] Unable to find image defined in HDD_1 in fixed directory for svirt backend to [tools] auto_review:"Unable to find image SLES15-SP3-JeOS.x86_64-15.3-kvm-and-xen-GM.qcow2.*svirt":retry
- Status changed from Resolved to Feedback
jlausuch wrote:
okurz wrote:
According to my research, e.g. https://unix.stackexchange.com/questions/433051/mount-nfs-stale-file-handle-error-cannot-umount , this is a situation which can just happen. The best solution I have seen so far is to check periodically in a cron job on the client, but I think the situation does not appear often enough for that. https://unix.stackexchange.com/a/447581 suggests calling exportfs -ua && exportfs -a on the server. I did that right now but I wonder when this should be done automatically.
https://unix.stackexchange.com/a/433071 suggests that the problem might be due to an outdated NFS4 client. So maybe the best course of action would be to ensure that openqaw5-xen itself is updated to a more current OS -> Created a specific ticket about that to "[qe-core]" to handle that in #109085
Ok, that makes sense. Thanks for taking care.
Btw, on my client side, I can now access the fixed dir.
Good to hear that.
Actually reopening, running a manual run of
export host=openqa.suse.de; bash -ex ./openqa-monitor-investigation-candidates | bash -e ./openqa-label-known-issues
Also labeled and retriggered multiple jobs manually.
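To illustrate what the auto_review expression in the subject is meant to catch, here is a small sketch that tests the regex against a failure line. The sample line is invented for illustration and is not copied from a real job, and using grep -E is an assumption; the labeling script may match with a different regex flavor.

```shell
# Regex taken from the ticket subject (the part between the quotes).
pattern='Unable to find image SLES15-SP3-JeOS.x86_64-15.3-kvm-and-xen-GM.qcow2.*svirt'

# Hypothetical failure line, made up for this example.
sample='Unable to find image SLES15-SP3-JeOS.x86_64-15.3-kvm-and-xen-GM.qcow2 in fixed dir (svirt backend)'

# Return 0 if the given text matches the known-issue regex.
matches_known_issue() {
    printf '%s\n' "$1" | grep -Eq -- "$pattern"
}

if matches_known_issue "$sample"; then
    echo "label and retry"
else
    echo "no match"
fi
```

Note that the dots in the image name are unescaped, so they match any character; that makes the expression slightly broader than the literal file name.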
- Description updated (diff)
- Status changed from Feedback to Resolved
$ openqa-query-for-job-label poo#109046
8429356|2022-03-29 07:53:52|done|failed|msdos||openqaworker2
8429355|2022-03-29 07:53:47|done|failed|minimal+base_yast||openqaworker2
8429350|2022-03-29 07:53:25|done|failed|lvm+RAID1||openqaworker2
8429351|2022-03-29 07:53:24|done|failed|minimal+base_yast||openqaworker2
8421641|2022-03-28 11:06:39|done|failed|msdos||openqaworker2
8421429|2022-03-28 10:41:33|done|failed|minimal+base_yast||openqaworker2
8421428|2022-03-28 10:35:50|done|failed|lvm+RAID1||openqaworker2
8420964|2022-03-28 10:29:16|done|failed|minimal+base_yast||openqaworker2
8419261|2022-03-28 09:32:13|done|failed|jeos-extratest||openqaworker2
8419259|2022-03-28 09:29:38|done|failed|jeos-filesystem||openqaworker2
looks good now