action #127622

closed

[openQA-in-openQA] openqa_install+publish takes very long - Use example distribution instead of os-autoinst-distri-opensuse size:M

Added by tinita 5 months ago. Updated 3 months ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: Feature requests
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Observation

See also #127412

In fetchneedles the git clone sometimes fails. Presumably this is more likely to happen because the needles repository is so big.

The test in openqa_install+publish installs openQA, clones os-autoinst-distri-opensuse & needles and runs a test. At the end a qcow image is published. The cloning takes several minutes, and the publishing of the big image takes even longer. The whole test can take 30-60 minutes.

It's of course fine to do that in general (if the published image is even used; we are not sure about that), but the openqa_install+publish is there to test if openQA works fine. We run it every hour, and I don't see why we need a new published image every hour.

Acceptance criteria

  • AC1: The test takes less than 30 minutes

Suggestions

  • Use a much smaller test repo, maybe using the new scenario cloning feature along with the example distribution, and create a new test for the image publishing that only runs once a day or so.
  • Have a setting like FULL_OPENSUSE_TEST to retain the publishing of the existing image
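
The second suggestion could be sketched roughly as follows. This is only an illustration, assuming the setting is exported as an environment variable with the value 1 when the full test is wanted; the function name select_casedir is hypothetical, and the opensuse repository URL is assumed from the repository name in the title.

```shell
# Hypothetical helper: pick the test distribution for the inner openQA job.
# FULL_OPENSUSE_TEST is the setting name proposed in this ticket; the
# function name and the "1 means enabled" convention are assumptions.
select_casedir() {
    if [ "${FULL_OPENSUSE_TEST:-0}" = "1" ]; then
        # Full run: big distribution, image gets published afterwards.
        echo "https://github.com/os-autoinst/os-autoinst-distri-opensuse.git"
    else
        # Default run: small example distribution, quick to clone.
        echo "https://github.com/os-autoinst/os-autoinst-distri-example.git"
    fi
}
```

With this shape the hourly run needs no extra settings, and only the less frequent full run has to set FULL_OPENSUSE_TEST=1.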

Related issues 1 (0 open, 1 closed)

Copied from openQA Project - action #127412: [openQA-in-openQA] test fails in test_distribution size:M (Resolved, tinita, 2023-04-10)

Actions #1

Updated by tinita 5 months ago

  • Copied from action #127412: [openQA-in-openQA] test fails in test_distribution size:M added
Actions #2

Updated by tinita 5 months ago

  • Subject changed from [openQA-in-openQA] openqa_install+publish - Use example distribution instead of os-autoinst-distri-opensuse to [openQA-in-openQA] openqa_install+publish takes very long - Use example distribution instead of os-autoinst-distri-opensuse
Actions #3

Updated by tinita 5 months ago

  • Description updated (diff)
Actions #4

Updated by okurz 5 months ago

  • Target version set to Ready
Actions #5

Updated by livdywan 5 months ago

  • Subject changed from [openQA-in-openQA] openqa_install+publish takes very long - Use example distribution instead of os-autoinst-distri-opensuse to [openQA-in-openQA] openqa_install+publish takes very long - Use example distribution instead of os-autoinst-distri-opensuse size:M
  • Description updated (diff)
  • Status changed from New to Workable
Actions #6

Updated by tinita 5 months ago

  • Description updated (diff)
Actions #7

Updated by osukup 4 months ago

  • Assignee set to osukup
Actions #8

Updated by osukup 4 months ago

  • Status changed from Workable to In Progress

I looked into the code, and using os-autoinst-distri-example isn't the best idea:

  • it needs bigger modifications to the test
  • there is no distri-example job running on o3 that we could clone and run as a test in openQA

The main problem is that the resulting image is too big: we shut down and build the image without any cleanup. So we can probably meet the AC by simply gracefully killing the running test in the tested instance and cleaning up caches, images, and results in the SUT before stopping the test.

Actions #9

Updated by okurz 4 months ago

osukup wrote:

so we can probably meet the AC by simply gracefully killing the running test in the tested instance and cleaning up caches, images, and results in the SUT before stopping the test.

Good point. I suggest you look into the system before shutdown to find out what the biggest size contributors actually are. Regarding the change to os-autoinst-distri-example, I suggest we put that task on hold and reconsider it later.
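
One way to find the biggest size contributors is to rank directories by disk usage inside the SUT before it shuts down. The following is a minimal sketch, not something taken from the test code; the function name largest_dirs is hypothetical.

```shell
# Hypothetical helper: list the largest directories below a root, biggest first.
# -x keeps du on one filesystem; -k reports KiB so a plain numeric sort works.
largest_dirs() {
    root="${1:-/}"      # directory to inspect, defaults to /
    count="${2:-10}"    # number of lines to show, defaults to 10
    du -xk "$root" 2>/dev/null | sort -rn | head -n "$count"
}
```

For example, `largest_dirs /var/lib/openqa 15` would show the 15 biggest directories under the openQA data directory, assuming the default location is in use. The first line is always the total for the root itself.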

Actions #10

Updated by osukup 4 months ago

After testing on my instance: cleaning up after the test (deleting test assets, the downloaded image, etc., and cleaning the zypper cache) reduced the size of the published qcow2 by only 300MB, which is pretty disappointing.
So I returned to the original proposal: use distri-example for the test in openQA_in_openQA, with the FULL_OPENSUSE_TEST variable controlling which distri is used.

Result: a run of the whole test, including compressing and uploading the image, takes under 15 minutes on my workstation, and the image size went from 5.3GB to 3.1GB.

https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/122

Actions #11

Updated by osukup 4 months ago

  • Status changed from In Progress to Resolved
Actions #12

Updated by okurz 4 months ago

  • Status changed from Resolved to Feedback

We still want to run the "full" suite sometimes. Like a nightly build?

Actions #13

Updated by osukup 4 months ago

  • Status changed from Feedback to Resolved
Actions #14

Updated by osukup 4 months ago

  • Status changed from Resolved to Feedback
Actions #15

Updated by osukup 4 months ago

okurz wrote:

We still want to run the "full" suite sometimes. Like a nightly build?

Then we need it scheduled by another mechanism or job in Jenkins?

Actions #16

Updated by okurz 4 months ago

Yes, or GitHub action

Actions #18

Updated by okurz 4 months ago

  • Priority changed from Low to Urgent

Yes, related. Bumping prio until the test failure is addressed.

Actions #19

Updated by osukup 3 months ago

https://openqa.opensuse.org/tests/3350077#step/test_running/12 -- looks like a problem on GitHub side, happened only in one test, and schedule command is from the example distri +- 1:1

Now the full test is scheduled every day by Jenkins (using the trigger-openqa_in_openqa script from os-autoinst/scripts).

Actions #20

Updated by osukup 3 months ago

  • Status changed from Feedback to Resolved

Standard jobs run against distri-example, and one install_openqa+publish job is scheduled daily.

Actions #21

Updated by tinita 3 months ago

osukup wrote:

https://openqa.opensuse.org/tests/3350077#step/test_running/12 -- looks like a problem on GitHub side, happened only in one test, and schedule command is from the example distri +- 1:1

I'm still curious how that is working: NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-distri-example.git/needles
If I call that command

openqa-cli schedule --param-file SCENARIO_DEFINITIONS_YAML=scenario-definitions.yaml DISTRI=example VERSION=0 FLAVOR=DVD ARCH=x86_64 TEST=simple_boot _GROUP_ID=0 BUILD=test CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-example.git NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-distri-example.git/needles

on my local instance, it fails with the same error as the mentioned test. It tries to literally clone https://github.com/os-autoinst/os-autoinst-distri-example.git/needles/ which of course does not work.
But somehow it is now working on o3. What is different?
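
To illustrate why cloning that value literally fails: only the part up to and including ".git" is a repository URL, and "needles" is a path inside the checkout. The sketch below just splits such a value into the two parts. The function names are hypothetical, and this does not describe how openQA or os-autoinst actually resolve NEEDLES_DIR.

```shell
# Hypothetical helpers: split a value of the form <repo>.git/<subdir>.
# Values without a "/<subdir>" part are returned unchanged by repo_url.
repo_url() {
    case "$1" in
        *.git/*) printf '%s\n' "${1%%.git/*}.git" ;;  # cut at the first ".git/"
        *)       printf '%s\n' "$1" ;;
    esac
}

repo_subdir() {
    case "$1" in
        *.git/*) printf '%s\n' "${1#*.git/}" ;;       # keep what follows ".git/"
        *)       printf '\n' ;;                        # no subdirectory given
    esac
}
```

Applied to the NEEDLES_DIR value above, repo_url yields the example distribution repository and repo_subdir yields "needles".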

Actions #22

Updated by okurz 3 months ago

  • Status changed from Resolved to Feedback

Let's clarify the above open point(s)

Actions #23

Updated by osukup 3 months ago

tinita wrote:

osukup wrote:

https://openqa.opensuse.org/tests/3350077#step/test_running/12 -- looks like a problem on GitHub side, happened only in one test, and schedule command is from the example distri +- 1:1

I'm still curious how that is working: NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-distri-example.git/needles
If I call that command

openqa-cli schedule --param-file SCENARIO_DEFINITIONS_YAML=scenario-definitions.yaml DISTRI=example VERSION=0 FLAVOR=DVD ARCH=x86_64 TEST=simple_boot _GROUP_ID=0 BUILD=test CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-example.git NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-distri-example.git/needles

on my local instance, it fails with the same error as the mentioned test. It tries to literally clone https://github.com/os-autoinst/os-autoinst-distri-example.git/needles/ which of course does not work.
But somehow it is now working on o3. What is different?

@mkittler? I copied this command from the distri-example CI, and it worked on my instance. How does it work internally? I don't know.

Actions #24

Updated by osukup 3 months ago

@tinita

openqa-cli schedule --param-file SCENARIO_DEFINITIONS_YAML=scenario-definitions.yaml DISTRI=example VERSION=0 FLAVOR=DVD ARCH=x86_64 TEST=simple_boot _GROUP_ID=0 BUILD=test CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-example.git NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-distri-example.git/needles

throws an error

but

openqa-cli schedule --param-file SCENARIO_DEFINITIONS_YAML=scenario-definitions.yaml DISTRI=example VERSION=0 FLAVOR=DVD ARCH=x86_64 TEST=simple_boot _GROUP_ID=0 BUILD=test CASEDIR=https://github.com/os-autoinst/os-autoinst-distri-example.git NEEDLES_DIR=%%CASEDIR%%/needles

works like a charm

Actions #25

Updated by tinita 3 months ago

@osukup ok, that might be, but in the test we are still using NEEDLES_DIR=https://github.com/os-autoinst/os-autoinst-distri-example.git/needles, see this screenshot:
https://openqa.opensuse.org/tests/3365692#step/start_test/5

So how is that working?

Actions #26

Updated by osukup 3 months ago

https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/126

It looks like, in most cases, the test simply catches the job in the running state while it is still in the test preparation phase...

Actions #27

Updated by osukup 3 months ago

  • Status changed from Feedback to Resolved