action #154156
coordination #58184 (closed): [saga][epic][use case] full version control awareness within openQA
coordination #152847: [epic] version control awareness within openQA for test distributions
[spike][timeboxed:10h] Cache test distributions from git on production size:S
Added by okurz 11 months ago. Updated 10 months ago.
Description
Motivation
As part of #138029, support for git caching was added to os-autoinst, covering test distributions as well as wheel repositories. We can now look into fully enabling caching of test distributions from git on a production openQA instance.
Acceptance criteria
- AC1: At least one test distribution on openqa.opensuse.org uses git caching successfully
Suggestions
- Read about GIT_CACHE_DIR in https://github.com/os-autoinst/os-autoinst/blob/master/doc/backend_vars.asciidoc and experiment with that in an openQA environment
- Experiment how the git cache dir will be populated for multiple openQA jobs running in the same environment
- Try out multiple openQA jobs running in parallel relying on access to GIT_CACHE_DIR
- Possibly use it directly on o3 and monitor the impact
Out of scope
- Manage storage capacity long-term / clean-up service
- Considering how the worker cache service interacts
Updated by okurz 11 months ago
- Copied from action #138029: [research][timeboxed:10h] How to cache "wheel" repositories which are stored on github size:M added
Updated by okurz 11 months ago
- Copied to action #154237: [spike][timeboxed:10h] Ensure the worker cache doesn't duplicate git caching of test distributions on o3 size:S added
Updated by okurz 11 months ago
- Related to action #154240: Ensure cloning openQA jobs with GIT_CACHE_DIR works in usual use cases added
Updated by mkittler 11 months ago
With https://github.com/os-autoinst/openQA/pull/5438, #154240 should be done, but for the acceptance tests it would make sense to implement this spike ticket first. (Both tickets are really intertwined and splitting them up was likely not very useful.)
Updated by mkittler 11 months ago · Edited
I enabled Git caching on openqaworker21 via:
mkdir -p /var/lib/openqa/cache/git && chown _openqa-worker:nogroup /var/lib/openqa/cache/git && bash -c "grep -q GIT_CACHE_DIR /etc/openqa/workers.ini || sed -i '/CACHEDIRECTORY/a GIT_CACHE_DIR = /var/lib/openqa/cache/git' /etc/openqa/workers.ini"
and cloned a test job:
openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org/tests/3902538 _GROUP=0 BUILD+=-test-for-poo-154156 WORKER_CLASS+=,openqaworker21 CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-opensuse.git
1 job has been created:
- microos-Tumbleweed-DVD-x86_64-Build20240129-microos@64bit -> https://openqa.opensuse.org/tests/3903427
It seems to work:
[2024-01-30T16:00:20.179490Z] [debug] [pid:71462] Current version is 4.6.1706517008.1bcd6e7 [interface v40]
[2024-01-30T16:00:20.187742Z] [info] [pid:71462] ::: OpenQA::Isotovideo::Utils::clone_git: Cloning git URL 'https://github.com/os-autoinst/os-autoinst-distri-opensuse.git' into '/var/lib/openqa/pool/12'
[2024-01-30T16:00:20.189092Z] [info] [pid:71462] ::: OpenQA::Isotovideo::Utils::_clone_bare_repo: Creating bare repository for caching 'https://github.com/os-autoinst/os-autoinst-distri-opensuse.git' under '/var/lib/openqa/cache/git//os-autoinst/os-autoinst-distri-opensuse.git'
[2024-01-30T16:00:50.256431Z] [debug] [pid:71462] Cloning into bare repository '/var/lib/openqa/cache/git//os-autoinst/os-autoinst-distri-opensuse.git'...
[2024-01-30T16:00:50.256745Z] [info] [pid:71462] ::: OpenQA::Isotovideo::Utils::_fetch_new_refs: Updating Git cache for 'https://github.com/os-autoinst/os-autoinst-distri-opensuse.git' under '/var/lib/openqa/cache/git//os-autoinst/os-autoinst-distri-opensuse.git'
[2024-01-30T16:00:50.811681Z] [debug] [pid:71462] From https://github.com/os-autoinst/os-autoinst-distri-opensuse
* branch HEAD -> FETCH_HEAD
[2024-01-30T16:00:53.372112Z] [debug] [pid:71462] Cloning into 'os-autoinst-distri-opensuse'...
[2024-01-30T16:00:53.382681Z] [debug] [pid:71462] git hash in '/var/lib/openqa/pool/12/os-autoinst-distri-opensuse': b5b0e1cc1ad8e8187bae931f7909527ea8ce5b3e
[2024-01-30T16:00:53.400748Z] [debug] [pid:71462] git url in '/var/lib/openqa/pool/12/os-autoinst-distri-opensuse': /var/lib/openqa/cache/git//os-autoinst/os-autoinst-distri-opensuse.git
I wrote the configuration command so that we could easily run it on all workers. I'm not sure whether it would actually make sense to enable this on all o3 workers at this point, though. Without any cleanup it is probably a bad idea. Maybe that's something I could look into as part of this ticket, as we gave it 10 hours and I have just spent 15 minutes :-)
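For anyone who wants to try the configuration step without touching a real worker, here is a minimal sketch that applies the same `grep -q ... || sed -i ...` idiom to a scratch file. The two-line file content is an assumption for illustration only (a real workers.ini has more settings), and the helper name `enable_git_cache` is hypothetical. It relies on GNU sed, as the original command does.

```shell
#!/bin/sh
# Sketch: exercise the idempotent workers.ini edit on a scratch copy.
set -eu

ini=$(mktemp)
trap 'rm -f "$ini"' EXIT

# Assumed minimal config for illustration; not a complete workers.ini.
printf '[global]\nCACHEDIRECTORY = /var/lib/openqa/cache\n' > "$ini"

# Hypothetical helper wrapping the one-liner from the comment above.
enable_git_cache() {
    # Append the setting after CACHEDIRECTORY only if it is not present yet.
    grep -q GIT_CACHE_DIR "$1" ||
        sed -i '/CACHEDIRECTORY/a GIT_CACHE_DIR = /var/lib/openqa/cache/git' "$1"
}

enable_git_cache "$ini"
enable_git_cache "$ini"   # second run changes nothing

cat "$ini"
count=$(grep -c '^GIT_CACHE_DIR' "$ini")
echo "GIT_CACHE_DIR lines: $count"
```

Running the helper twice leaves exactly one GIT_CACHE_DIR line, which is why the command is safe to repeat on every worker.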
Updated by okurz 11 months ago
This looks promising:
[2024-01-30T16:00:20.187742Z] [info] [pid:71462] ::: OpenQA::Isotovideo::Utils::clone_git: Cloning git URL 'https://github.com/os-autoinst/os-autoinst-distri-opensuse.git' into '/var/lib/openqa/pool/12'
[2024-01-30T16:00:20.189092Z] [info] [pid:71462] ::: OpenQA::Isotovideo::Utils::_clone_bare_repo: Creating bare repository for caching 'https://github.com/os-autoinst/os-autoinst-distri-opensuse.git' under '/var/lib/openqa/cache/git//os-autoinst/os-autoinst-distri-opensuse.git'
[2024-01-30T16:00:50.256431Z] [debug] [pid:71462] Cloning into bare repository '/var/lib/openqa/cache/git//os-autoinst/os-autoinst-distri-opensuse.git'...
[2024-01-30T16:00:50.256745Z] [info] [pid:71462] ::: OpenQA::Isotovideo::Utils::_fetch_new_refs: Updating Git cache for 'https://github.com/os-autoinst/os-autoinst-distri-opensuse.git' under '/var/lib/openqa/cache/git//os-autoinst/os-autoinst-distri-opensuse.git'
[2024-01-30T16:00:50.811681Z] [debug] [pid:71462] From https://github.com/os-autoinst/os-autoinst-distri-opensuse
* branch HEAD -> FETCH_HEAD
[2024-01-30T16:00:53.372112Z] [debug] [pid:71462] Cloning into 'os-autoinst-distri-opensuse'...
[2024-01-30T16:00:53.382681Z] [debug] [pid:71462] git hash in '/var/lib/openqa/pool/12/os-autoinst-distri-opensuse': b5b0e1cc1ad8e8187bae931f7909527ea8ce5b3e
[2024-01-30T16:00:53.400748Z] [debug] [pid:71462] git url in '/var/lib/openqa/pool/12/os-autoinst-distri-opensuse': /var/lib/openqa/cache/git//os-autoinst/os-autoinst-distri-opensuse.git
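The sequence in the log (create a bare cache repository, fetch new refs into it, then clone the working copy from the cache) can be approximated with plain git commands. This is only a sketch of the observable steps, not the os-autoinst implementation: the local `upstream` repository stands in for the GitHub URL, and all paths are temporary directories made up for the example.

```shell
#!/bin/sh
# Sketch: reproduce the three steps visible in the log with plain git.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in for the remote test distribution (hypothetical local repository).
upstream="$work/upstream"
git init -q "$upstream"
git -C "$upstream" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m 'initial commit'

cache="$work/cache/os-autoinst-distri-opensuse.git"
pool="$work/pool/12"
mkdir -p "$work/cache" "$pool"

# 1. Create the bare cache repository on first use.
[ -d "$cache" ] || git clone -q --bare "$upstream" "$cache"

# 2. Refresh the cache from upstream (HEAD -> FETCH_HEAD, as in the log).
git -C "$cache" fetch -q "$upstream" HEAD

# 3. Clone the working copy from the cache instead of from upstream.
git clone -q "$cache" "$pool/os-autoinst-distri-opensuse"

hash=$(git -C "$pool/os-autoinst-distri-opensuse" rev-parse HEAD)
origin=$(git -C "$pool/os-autoinst-distri-opensuse" remote get-url origin)
echo "git hash: $hash"
echo "git url:  $origin"
```

As in the real log, the working copy's remote URL ends up being the cache path rather than the upstream URL.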
Updated by tinita 11 months ago
One thing to note:
https://openqa.opensuse.org/tests/3903427/file/vars.json
"TEST_GIT_URL" : "/var/lib/openqa/cache/git//os-autoinst/os-autoinst-distri-opensuse.git",
We might want to make the generation of TEST_GIT_URL a bit more intelligent.
Or we make openqa-investigate more intelligent to figure out the web url for that.
(Also maybe we should try to get rid of the double slash. Just because it hurts my eye ;-)
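A small sketch of how the double slash could be avoided, assuming it comes from joining the cache directory and the URL path with an extra separator (I have not checked the actual code path in os-autoinst). The variable names are made up for the example.

```shell
#!/bin/sh
# Sketch: join cache directory and URL path without producing "//".
set -eu

git_cache_dir='/var/lib/openqa/cache/git'
url='https://github.com/os-autoinst/os-autoinst-distri-opensuse.git'

# Strip scheme and host, leaving "/os-autoinst/os-autoinst-distri-opensuse.git".
url_path=$(printf '%s\n' "$url" | sed -E 's#^[a-z]+://[^/]+##')

# Naive join: the path already starts with "/", so this yields "git//os-autoinst".
naive="$git_cache_dir/$url_path"

# Normalized join: drop a trailing slash from the directory and a leading
# slash from the path before inserting exactly one separator.
cache_path="${git_cache_dir%/}/${url_path#/}"

echo "naive:      $naive"
echo "normalized: $cache_path"
```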
Updated by openqa_review 11 months ago
- Due date set to 2024-02-14
Setting due date based on mean cycle time of SUSE QE Tools
Updated by okurz 11 months ago · Edited
tinita wrote in #note-13:
One thing to note:
https://openqa.opensuse.org/tests/3903427/file/vars.json
"TEST_GIT_URL" : "/var/lib/openqa/cache/git//os-autoinst/os-autoinst-distri-opensuse.git",
We might want to make the generation of TEST_GIT_URL a bit more intelligent.
Or we make openqa-investigate more intelligent to figure out the web url for that.
How about setting
TEST_GIT_ORIG_URL=TEST_GIT_URL if GIT_CACHE_DIR
or the inverse: Never update TEST_GIT_URL but save an additional TEST_GIT_CACHE_URL if GIT_CACHE_DIR
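To make the second option concrete, here is a sketch in shell. TEST_GIT_CACHE_URL is only the name proposed in this comment, not an existing variable, and the values are taken from the job above purely for illustration.

```shell
#!/bin/sh
# Sketch of the "inverse" proposal: TEST_GIT_URL is never rewritten,
# the cache location is stored in a separate (proposed) variable.
set -eu

GIT_CACHE_DIR='/var/lib/openqa/cache/git'   # set to '' to simulate caching disabled
TEST_GIT_URL='https://github.com/os-autoinst/os-autoinst-distri-opensuse.git'
TEST_GIT_CACHE_URL=''

if [ -n "$GIT_CACHE_DIR" ]; then
    # Record where the cached copy lives; TEST_GIT_URL keeps pointing at the remote.
    TEST_GIT_CACHE_URL="$GIT_CACHE_DIR/os-autoinst/os-autoinst-distri-opensuse.git"
fi

printf 'TEST_GIT_URL=%s\nTEST_GIT_CACHE_URL=%s\n' "$TEST_GIT_URL" "$TEST_GIT_CACHE_URL"
```

With this variant, tools that read TEST_GIT_URL (such as openqa-investigate) would still see a web URL without needing to know about the cache.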
Updated by mkittler 11 months ago
This PR will restore the behavior regarding those variables: https://github.com/os-autoinst/os-autoinst/pull/2452
Updated by mkittler 11 months ago
Here's what basic cleanup could look like: https://github.com/os-autoinst/os-autoinst/pull/2453
(No optimizations, such as skipping the cleanup altogether when the disk usage is below a certain threshold, have been implemented.)
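To illustrate the general idea of a size-limited cache, here is a rough shell sketch. It is not what the PR does: the eviction order (least recently modified first), the tiny demo limit and the fake repository directories are all assumptions made up for the example. It uses GNU `du -sb`, `touch -d` and `find -printf`.

```shell
#!/bin/sh
# Sketch: delete the oldest cached repositories until usage is under a limit.
set -eu

cache=$(mktemp -d)
trap 'rm -rf "$cache"' EXIT
limit=100000   # demo limit in bytes; a real limit would be far larger

# Three fake "repositories" of 40 KiB each (hypothetical names).
for name in old.git mid.git new.git; do
    mkdir -p "$cache/$name"
    head -c 40960 /dev/zero > "$cache/$name/pack"
done
touch -d '3 days ago' "$cache/old.git"
touch -d '2 days ago' "$cache/mid.git"

# Apparent size of the whole cache directory in bytes.
usage() { du -sb "$cache" | cut -f1; }

echo "usage before: $(usage)"

# List top-level directories oldest first and remove them one by one
# until the total size no longer exceeds the limit.
find "$cache" -mindepth 1 -maxdepth 1 -type d -printf '%T@ %p\n' |
    sort -n | cut -d' ' -f2- |
    while IFS= read -r repo; do
        if [ "$(usage)" -le "$limit" ]; then
            break
        fi
        echo "removing $repo"
        rm -rf "$repo"
    done

echo "usage after: $(usage)"
```

In this demo only the oldest directory is removed, because dropping it already brings the total below the limit.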
Updated by okurz 11 months ago
- Copied to action #154783: [spike][timeboxed:10h] Run os-autoinst-distri-example directly from git and ensure candidate needles show up on the web UI size:S added
Updated by mkittler 11 months ago · Edited
Plan for this ticket:
- Do further testing of https://github.com/os-autoinst/os-autoinst/pull/2453 locally (so far I only ran unit tests).
- Wait for feedback on https://github.com/os-autoinst/os-autoinst/pull/2453 and possibly implement requested changes.
- Wait until https://github.com/os-autoinst/os-autoinst/pull/2453 is deployed.
- Enable Git caching on all o3 workers (see #154156#note-11).
- Do another round of testing and wait at least 2 days to see whether this didn't break anything.
Caching of needles on the web UI side is a whole different story which makes no sense to fit into this timeboxed ticket (especially as it doesn't really match this ticket's title and AC). For that we also already have a pending PR (https://github.com/os-autoinst/openQA/pull/5175).
Updated by mkittler 11 months ago
- Related to action #155104: sh: /usr/bin/du: Permission denied on openqaworker21 added
Updated by mkittler 11 months ago
PR for the permission error: https://github.com/os-autoinst/openQA/pull/5458
Updated by mkittler 11 months ago · Edited
- Status changed from Feedback to In Progress
I rebooted openqaworker21 again, enabled the caching and cloned a test job:
openqa-clone-job --skip-chained-deps --within-instance https://openqa.opensuse.org/tests/3922612 _GROUP=0 {TEST,BUILD}+=-test-for-poo-154156 WORKER_CLASS+=,openqaworker21 CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-opensuse.git
1 job has been created:
- microos-Tumbleweed-DVD-x86_64-Build20240207-microos@64bit -> https://openqa.opensuse.org/tests/3923782
EDIT: It works, I'm going to reboot the other machines and enable caching there as well.
EDIT: I have now enabled it on all workers by running the following for each set of hosts:
for i in $hosts; do echo $i && ssh root@$i " mkdir -p /var/lib/openqa/cache/git && chown _openqa-worker:nogroup /var/lib/openqa/cache/git && bash -c \"grep -q GIT_CACHE_DIR /etc/openqa/workers.ini || sed -i '/CACHEDIRECTORY/a GIT_CACHE_DIR = /var/lib/openqa/cache/git' /etc/openqa/workers.ini ; grep -q GIT_CACHE_DIR_LIMIT /etc/openqa/workers.ini || sed -i '/GIT_CACHE_DIR/a GIT_CACHE_DIR_LIMIT = 10737418240' /etc/openqa/workers.ini\" " ; done
The only worker I skipped was openqaworker27 (no openQA worker setup).