action #127622
closed
[openQA-in-openQA] openqa_install+publish takes very long - Use example distribution instead of os-autoinst-distri-opensuse size:M
Description
Observation
See also #127412
In fetchneedles the git clone sometimes fails; presumably this is more likely to happen because of the big needles repository.
The test in openqa_install+publish installs openQA, clones os-autoinst-distri-opensuse & needles and runs a test. At the end a qcow image is published. The cloning takes several minutes, and the publishing of the big image takes even longer. The whole test can take 30-60 minutes.
It's of course fine to do that in general (if the published image is even used; we are not sure about that), but the openqa_install+publish is there to test if openQA works fine. We run it every hour, and I don't see why we need a new published image every hour.
Acceptance criteria
- AC1: The test takes less than 30 minutes
Suggestions
- Use a much smaller test repo, maybe using the new scenario cloning feature along with the example distribution, and create a new test for the image publishing that only runs once a day or so.
- Have a setting like FULL_OPENSUSE_TEST to retain the publishing of the existing image
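A minimal sketch of how such a switch might select the test distribution. Only FULL_OPENSUSE_TEST and the two distribution names come from this ticket; the helper name `select_casedir` and the convention that any non-empty value other than `0` enables the full run are assumptions for illustration, not the actual implementation.

```shell
#!/bin/sh
# Hypothetical helper: choose which test distribution to clone.
# Assumption: FULL_OPENSUSE_TEST=1 requests the full openSUSE run;
# unset, empty or 0 selects the small example distribution.
select_casedir() {
    if [ -n "${FULL_OPENSUSE_TEST:-}" ] && [ "${FULL_OPENSUSE_TEST}" != "0" ]; then
        echo "https://github.com/os-autoinst/os-autoinst-distri-opensuse.git"
    else
        echo "https://github.com/os-autoinst/os-autoinst-distri-example.git"
    fi
}

echo "CASEDIR=$(select_casedir)"
```

With this shape the hourly job would leave the variable unset, and only the less frequent publishing job would set FULL_OPENSUSE_TEST=1.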
Updated by tinita over 1 year ago
- Copied from action #127412: [openQA-in-openQA] test fails in test_distribution size:M added
Updated by tinita over 1 year ago
- Subject changed from [openQA-in-openQA] openqa_install+publish - Use example distribution instead of os-autoinst-distri-opensuse to [openQA-in-openQA] openqa_install+publish takes very long - Use example distribution instead of os-autoinst-distri-opensuse
Updated by livdywan over 1 year ago
- Subject changed from [openQA-in-openQA] openqa_install+publish takes very long - Use example distribution instead of os-autoinst-distri-opensuse to [openQA-in-openQA] openqa_install+publish takes very long - Use example distribution instead of os-autoinst-distri-opensuse size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by osukup over 1 year ago
- Status changed from Workable to In Progress
I looked into the code, and using os-autoinst-distri-example isn't the best idea:
- it needs bigger modifications to the test
- there is no distri example running in o3 to clone and run as a test in openQA
The main problem is that the resulting image is too big: we shut down and build the image without any cleanup. So we can probably meet the AC by simply gracefully killing the running test in the tested instance and cleaning up caches, images and results in the SUT before stopping the test.
Updated by okurz over 1 year ago
osukup wrote:
so we can probably meet the AC by simply gracefully killing the running test in the tested instance and cleaning up caches, images and results in the SUT before stopping the test.
Good point. I suggest you look into the system before shutdown to find out what the biggest size contributors actually are. Regarding the change to os-autoinst-distri-example, I suggest we put this task on hold to reconsider.
Updated by osukup over 1 year ago
After testing on my instance: with cleanup after the test (deleting test assets, the downloaded image etc., and cleaning the zypper cache), the whole difference in the size of the published qcow2 is 300MB, which is pretty disappointing.
So back to the original proposal: use distri-example for the test in openQA_in_openQA, with a FULL_OPENSUSE_TEST variable to control which distri is used.
Result: a run of the whole test, including compressing and uploading the image, takes under 15 minutes on my workstation, and the image size went from 5.3GB to 3.1GB.
https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/122
Updated by okurz over 1 year ago
- Status changed from Resolved to Feedback
We still want to run the "full" suite sometimes. Like a nightly build?
Updated by osukup over 1 year ago
okurz wrote:
We still want to run the "full" suite sometimes. Like a nightly build?
Then we need it scheduled by another mechanism/job in Jenkins?
Updated by jbaier_cz over 1 year ago
Probably related: https://openqa.opensuse.org/tests/3350077#step/test_running/12
Updated by okurz over 1 year ago
- Priority changed from Low to Urgent
Yes, related. Bumping prio until the test failure is addressed.
Updated by osukup over 1 year ago
https://openqa.opensuse.org/tests/3350077#step/test_running/12 -- looks like a problem on GitHub side, happened only in one test, and schedule command is from the example distri +- 1:1
Now the full test is scheduled every day by Jenkins (using the trigger-openqa_in_openqa script from os-autoinst/scripts).
Updated by osukup over 1 year ago
- Status changed from Feedback to Resolved
Standard jobs run against distri-example, plus one openqa_install+publish job is scheduled daily.
Updated by tinita over 1 year ago
osukup wrote:
https://openqa.opensuse.org/tests/3350077#step/test_running/12 -- looks like a problem on GitHub side, happened only in one test, and schedule command is from the example distri +- 1:1
I'm still curious how that is working: NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-distri-example.git/needles
If I call that command
openqa-cli schedule --param-file SCENARIO_DEFINITIONS_YAML=scenario-definitions.yaml DISTRI=example VERSION=0 FLAVOR=DVD ARCH=x86_64 TEST=simple_boot _GROUP_ID=0 BUILD=test CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-example.git NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-distri-example.git/needles
on my local instance, it fails with the same error as the mentioned test. It tries to literally clone https://github.com/os-autoinst/os-autoinst-distri-example.git/needles/ which of course does not work.
But somehow it is now working on o3. What is different?
Updated by okurz over 1 year ago
- Status changed from Resolved to Feedback
Let's clarify the above open point(s)
Updated by osukup over 1 year ago
tinita wrote:
osukup wrote:
https://openqa.opensuse.org/tests/3350077#step/test_running/12 -- looks like a problem on GitHub side, happened only in one test, and schedule command is from the example distri +- 1:1
I'm still curious how that is working:
NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-distri-example.git/needles
If I call that command
openqa-cli schedule --param-file SCENARIO_DEFINITIONS_YAML=scenario-definitions.yaml DISTRI=example VERSION=0 FLAVOR=DVD ARCH=x86_64 TEST=simple_boot _GROUP_ID=0 BUILD=test CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-example.git NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-distri-example.git/needles
on my local instance, it fails with the same error as the mentioned test. It tries to literally clone https://github.com/os-autoinst/os-autoinst-distri-example.git/needles/ which of course does not work.
But somehow it is now working on o3. What is different?
@mkittler? I copied this command from the distri-example CI, and it worked on my instance. How it works internally, I don't know.
Updated by osukup over 1 year ago
openqa-cli schedule --param-file SCENARIO_DEFINITIONS_YAML=scenario-definitions.yaml DISTRI=example VERSION=0 FLAVOR=DVD ARCH=x86_64 TEST=simple_boot _GROUP_ID=0 BUILD=test CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-example.git NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-distri-example.git/needles
throws an error, but
openqa-cli schedule --param-file SCENARIO_DEFINITIONS_YAML=scenario-definitions.yaml DISTRI=example VERSION=0 FLAVOR=DVD ARCH=x86_64 TEST=simple_boot _GROUP_ID=0 BUILD=test CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-example.git NEEDLES_DIR=%%CASEDIR%%/needles
works like a charm
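To make the difference between the two invocations explicit, here is a small dry-run sketch that only assembles and prints the command line in the working form. The wrapper name `print_schedule_cmd` is made up for illustration; all settings are copied from the commands above, and nothing is sent to openQA.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the openqa-cli call instead of running it.
# NEEDLES_DIR refers to CASEDIR via the %%CASEDIR%% placeholder rather than
# repeating the repository URL with "/needles" appended.
print_schedule_cmd() {
    casedir="$1"
    # The format '%s ' is reused for every argument, so each one is
    # printed followed by a single space.
    printf '%s ' \
        openqa-cli schedule \
        --param-file SCENARIO_DEFINITIONS_YAML=scenario-definitions.yaml \
        DISTRI=example VERSION=0 FLAVOR=DVD ARCH=x86_64 \
        TEST=simple_boot _GROUP_ID=0 BUILD=test \
        "CASEDIR=${casedir}"
    # Passed as an argument to %s, so the percent signs are printed literally.
    printf '%s\n' 'NEEDLES_DIR=%%CASEDIR%%/needles'
}

print_schedule_cmd "https://github.com/os-autoinst/os-autoinst-distri-example.git"
```

Printing the command first makes it easy to compare against what the test actually types into the SUT before scheduling anything.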
Updated by tinita over 1 year ago
@osukup ok, that might be, but in the test we are still using NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-distri-example.git/needles, see this screenshot:
https://openqa.opensuse.org/tests/3365692#step/start_test/5
So how is that working?
Updated by osukup over 1 year ago
https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/126
It looks like in most cases the test simply catches the job as running while it is still in the test preparation phase...