action #129340
closed
[regression] openqa cannot start jobs with symlinked assets size:M
Added by ph03nix over 1 year ago.
Updated over 1 year ago.
Category:
Regressions/Crashes
Description
Observation
When starting a job with an asset that is a symlink, openQA dies with the following error message:
qemu-img: Could not open '/var/lib/openqa/pool/4/ignition.qcow2': Failed to lock byte 201
ph03nix wrote:
cdywan wrote:
Can you clarify e.g. is this a recent regression? Maybe if you have a job that used to work? If it wasn't already cleaned up in the meanwhile, adding this ticket would stop it from being removed.
This is a recent regression. I have an automated script that creates symlinks to assets; it used to work, but recently it stopped working with exactly this issue.
I only have failures on my own openQA instance, but it's trivial to reproduce the issue (use a symlinked asset).
Steps to reproduce
- Create an asset which is a symlink
- Start a job with this asset as HDD_1 variable (or HDD_2, ...)
Impact
- Regression: symlinks can no longer be used as asset files
Problem
Suggestion
- Investigate what the actual problem is
- Try to reproduce the problem with os-autoinst only. If it is not reproducible there, check whether it needs an openQA worker with or without the worker cache enabled
Workarounds
- Use hard-links or copies of asset files
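For anyone who needs the hard-link/copy workaround in a script, here is a minimal sketch. It assumes the symlinked asset can simply be replaced in place; `materialize_asset` and the `.tmp` suffix are made-up names for illustration, not part of openQA.

```python
import os
import shutil


def materialize_asset(path):
    """Replace a symlinked asset with a hard link to its target, or a copy."""
    if not os.path.islink(path):
        return  # already a regular file, nothing to do
    target = os.path.realpath(path)  # resolves relative and chained symlinks
    tmp = path + ".tmp"  # hypothetical temporary name next to the asset
    try:
        os.link(target, tmp)  # hard link: same inode, no extra disk space
    except OSError:
        # Hard links can fail, e.g. when the target is on another filesystem.
        shutil.copy2(target, tmp)
    os.replace(tmp, path)  # swap the symlink itself for the real file
```

Calling `materialize_asset("/path/to/asset.qcow2")` on a symlink should leave a regular file at that path; on a regular file it does nothing.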
- Description updated (diff)
- Category set to Feature requests
- Target version set to future
As explained in chat there are workarounds to consider. I included them now in the description.
I'd argue that this is not a feature request but rather a bug because this was working before.
ph03nix wrote:
I'd argue that this is not a feature request but rather a bug because this was working before.
Can you clarify e.g. is this a recent regression? Maybe if you have a job that used to work? If it wasn't already cleaned up in the meanwhile, adding this ticket would stop it from being removed.
cdywan wrote:
Can you clarify e.g. is this a recent regression? Maybe if you have a job that used to work? If it wasn't already cleaned up in the meanwhile, adding this ticket would stop it from being removed.
This is a recent regression. I have an automated script that creates symlinks to assets; it used to work, but recently it stopped working with exactly this issue.
I only have failures on my own openQA instance, but it's trivial to reproduce the issue (use a symlinked asset).
- Subject changed from openqa cannot start jobs with symlinked assets to [regression] openqa cannot start jobs with symlinked assets
- Category changed from Feature requests to Regressions/Crashes
- Target version changed from future to Ready
- Subject changed from [regression] openqa cannot start jobs with symlinked assets to [regression] openqa cannot start jobs with symlinked assets size:M
- Description updated (diff)
- Status changed from New to Workable
- Status changed from Workable to In Progress
- Assignee set to tinita
@ph03nix is there a difference between absolute and relative symlinks?
E.g. does the link under HDD_1 point to another file with a relative or an absolute path?
I just tested it, and can confirm that a relative path does not work (although it results in a different error), but with an absolute path it works.
The reason is that we create a (hard) link in the pool directory to the path in HDD_1, basically doing the equivalent of ln /path/to/hdd_filename /pool/hdd_filename, and if /path/to/hdd_filename -> relfile then that results in /pool/hdd_filename -> relfile, because it just hardlinks a symlink.
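The effect can be reproduced outside openQA with a small sketch. It assumes Linux, where hard-linking a symlink is allowed; the directory layout and file names are made up for the demonstration and this is not the os-autoinst code.

```python
import os
import tempfile


def hardlink_symlink_into_pool(workdir, absolute):
    """Hard-link a symlinked asset into a pool dir and report the result."""
    assets = os.path.join(workdir, "assets")
    pool = os.path.join(workdir, "pool")
    os.makedirs(assets)
    os.makedirs(pool)
    real = os.path.join(assets, "real.qcow2")
    with open(real, "wb") as fh:
        fh.write(b"disk image placeholder")
    # The symlink stores its target verbatim, relative or absolute.
    asset = os.path.join(assets, "hdd.qcow2")
    os.symlink(real if absolute else "real.qcow2", asset)
    pooled = os.path.join(pool, "hdd.qcow2")
    # follow_symlinks=False hard-links the symlink itself, not its target.
    os.link(asset, pooled, follow_symlinks=False)
    # A relative target is now looked up relative to the pool directory.
    return os.readlink(pooled), os.path.exists(pooled)


if __name__ == "__main__":
    for absolute in (False, True):
        print(absolute, hardlink_symlink_into_pool(tempfile.mkdtemp(), absolute))
```

With a relative target the pool entry should come out dangling, because real.qcow2 does not exist in the pool directory; with an absolute target it should still resolve.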
I can't see any recent changes related to that.
So using absolute symlinks on your side should be a workaround for now.
Meanwhile I'm thinking about what would be the best to do on our side.
We fall back to symlinks if the hard link fails, anyway, so we could just check if the asset file is a symlink.
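A rough sketch of that idea, for illustration only: `link_asset_into_pool` is a hypothetical helper and not the actual fix. It assumes that pointing the pool entry at the resolved absolute path is acceptable.

```python
import os


def link_asset_into_pool(asset, pool_dir):
    """Hypothetical helper: make `asset` available inside `pool_dir`."""
    dest = os.path.join(pool_dir, os.path.basename(asset))
    if os.path.islink(asset):
        # A hard link would carry over the (possibly relative) target
        # verbatim, so symlink to the resolved absolute path instead.
        os.symlink(os.path.realpath(asset), dest)
        return dest
    try:
        os.link(asset, dest)
    except OSError:
        # Existing fallback described above: symlink if hard-linking fails.
        os.symlink(os.path.abspath(asset), dest)
    return dest
```

With this check a relative symlink under HDD_1 should end up as a pool entry that still resolves to the real image file.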
- Due date set to 2023-07-07
Setting due date based on mean cycle time of SUSE QE Tools
- Status changed from In Progress to Feedback
Hey Tina! Good job and thanks for the fix! AFAICS this should resolve it as a whole. Do you still need something from me?
@ph03nix I would only be curious if you can confirm my conclusion:
- Absolute symlinks have been working before my fix, only relative symlinks didn't
- I can't see any recent changes related to this, so it's not a regression
Then I can turn this into a feature request retroactively :)
tinita wrote:
@ph03nix I would only be curious if you can confirm my conclusion:
- Absolute symlinks have been working before my fix, only relative symlinks didn't
AFAICS my script always created relative symlinks and I'm still using relative symlinks there. Only recently I had to replace them with hard links.
- I can't see any recent changes related to this, so it's not a regression
I share your observation that there have not been recent changes; however, this contradicts my observation that an automated script which I haven't touched in a year stopped working at some point in the last months.
From my observation, this was a regression, but that's IMHO also a minor and irrelevant taxonomical detail, as long as it's fixed ;-)
Then I can turn this into a feature request retroactively :)
No objections from my side, and thanks for the fix :-)
- Status changed from Feedback to Resolved
ph03nix wrote:
I share your observation that there have not been recent changes; however, this contradicts my observation that an automated script which I haven't touched in a year stopped working at some point in the last months.
Ok. One last question: How many months could "last months" be roughly?
We didn't have tests for relative symlinks, so it's at least possible that there is some other place that could have influenced that behaviour, so I would be curious, but only if I have a more concrete time frame to look into :)
I will resolve this now in any case. Will keep it categorized as a bug.
tinita wrote:
Ok. One last question: How many months could "last months" be roughly?
3-6 months. I wish I could give you a more accurate time window but I can't :-(
We didn't have tests for relative symlinks, so it's at least possible that there is some other place that could have influenced that behaviour, so I would be curious, but only if I have a more concrete time frame to look into :)
I fully understand. Unfortunately I can't tell, because I haven't used the affected symlinked assets on my test instance in a long time.