coordination #117673
coordination #154777: [saga][epic] Shareable os-autoinst and test distribution plugins
coordination #108527: [epic] os-autoinst wheels for scalable code reuse of helper functions and segmented test distributions
[epic][tools] sporadic "Unable to clone Git repository" for wheels
50%
Description
Observation
Every so often, the plugins fail to clone a repository, for instance #117622#note-
https://openqa.suse.de/tests/9668977#
[2022-10-06T02:01:46.761898+02:00] [info] ::: OpenQA::Isotovideo::Utils::checkout_git_repo_and_branch: Cloning git URL 'https://github.com/Zaoliang/functional_wheel'
[2022-10-06T02:01:52.973889+02:00] [debug] Cloning into 'functional_wheel'...
fatal: unable to access 'https://github.com/Zaoliang/functional_wheel/': OpenSSL SSL_connect: Connection reset by peer in connection to github.com:443
Acceptance criteria
- AC1: A plugin can be loaded on demand (this probably requires more thought about the design)
- AC2: A GitHub token can be used for authenticated requests (I suspect that we're hitting the rate limit here)
- AC3: Wheels cloning is retried (a maximum of N retries can be configured in wheels.yaml)
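The retry behaviour asked for in AC3 could look roughly like the following. This is only a minimal sketch in Python, not the os-autoinst implementation. The function name `clone_with_retries`, the `max_retries` and `delay` parameters, and the exponential backoff are assumptions made for illustration. How the retry count would be read from wheels.yaml is left open.

```python
import subprocess
import time


def clone_with_retries(url, dest, max_retries=3, delay=1.0,
                       run=subprocess.run, sleep=time.sleep):
    """Clone `url` into `dest`, retrying on failure.

    Hypothetical helper: `max_retries` stands in for a value that could
    come from wheels.yaml. `run` and `sleep` are injectable so the retry
    logic can be exercised without network access.
    Returns the number of attempts that were needed.
    """
    attempts = max_retries + 1  # the first try plus N retries
    for attempt in range(1, attempts + 1):
        result = run(["git", "clone", "--depth", "1", url, dest],
                     capture_output=True, text=True)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Back off exponentially: delay, 2*delay, 4*delay, ...
            sleep(delay * 2 ** (attempt - 1))
    raise RuntimeError(
        f"Unable to clone Git repository '{url}' after {attempts} attempts: "
        f"{result.stderr.strip()}"
    )
```

A short backoff like this would only help with brief connection resets such as the one in the log above. It would not help with a longer GitHub outage, where a cache of the repositories would be needed instead.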
Updated by szarate about 2 years ago
- Copied from action #117622: [qe-core] Unable to clone Git repository for wheels added
Updated by szarate about 2 years ago
- Subject changed from [tools] Unable to clone Git repository for wheels to [tools] sporadic "Unable to clone Git repository" for wheels
I think implementing AC2 and AC3 should be a good enough solution for starters.
Updated by okurz about 2 years ago
- Category changed from Regressions/Crashes to Feature requests
Updated by MDoucha about 2 years ago
- Category changed from Feature requests to Regressions/Crashes
@okurz: We're reporting here that this new thing is randomly breaking unrelated tests. In what way is that a "feature request"?
Updated by livdywan about 2 years ago
- Category changed from Regressions/Crashes to Feature requests
MDoucha wrote:
@okurz: We're reporting here that this new thing is randomly breaking unrelated tests. In what way is that a "feature request"?
This isn't a regression, hence the category Feature requests.
Updated by szarate about 2 years ago
cdywan wrote:
MDoucha wrote:
@okurz: We're reporting here that this new thing is randomly breaking unrelated tests. In what way is that a "feature request"?
This isn't a regression, hence the category Feature requests.
I'd argue it's a bug in the implementation. AC1 and AC2 are more feature requests than AC3 (which is a workaround/solution to the bug). But semantics aside, for now the possibility of 403 HTTP errors blocks anybody from properly using the plugin system for the test distribution.
Updated by livdywan about 2 years ago
Keep in mind that the category is based on what's known to work in released code, not what is desirable or common sense.
How often does this occur? Maybe the prio should be raised? Worst case the wheels.yaml could be dropped temporarily to avoid affecting many tests.
Updated by MDoucha about 2 years ago
cdywan wrote:
How often does this occur? Maybe the prio should be raised? Worst case the wheels.yaml could be dropped temporarily to avoid affecting many tests.
It happens each time GitHub has an outage or OpenQA workers lose network access to the outside world. If GitHub goes down for 10 minutes while the OpenQA queue is full of livepatch tests, we'll end up with 8000+ failed jobs because of that.
Updated by okurz about 2 years ago
[…] for now the possibility of 403 HTTP errors blocks anybody from properly using the plugin system for the test distribution.
I agree. I suggest actually disabling the use of wheels in os-autoinst-distri-opensuse until this has been resolved.
Updated by okurz about 2 years ago
- Tracker changed from action to coordination
- Subject changed from [tools] sporadic "Unable to clone Git repository" for wheels to [epic][tools] sporadic "Unable to clone Git repository" for wheels
- Description updated (diff)
- Assignee set to okurz
Updated by livdywan about 2 years ago
- Copied to action #118903: Repositories for wheels should be cached added
Updated by okurz about 2 years ago
- Status changed from Blocked to New
- Assignee deleted (okurz)
- Target version changed from Ready to future
For now, we will first continue with some other subtasks in the parent #108527 before we reconsider this.
Updated by okurz almost 2 years ago
https://github.com/os-autoinst/os-autoinst/pull/2238 was just merged, so os-autoinst will now retry on wheels cloning issues. These issues were one of the reasons why wheels usage was temporarily pulled from os-autoinst-distri-opensuse. This should cover AC3.