action #64845

closed

[sle][migration][SLE15SP3] Enhance tests to not repeat "patch_sle" in so many scenarios

Added by okurz about 4 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Enhancement to existing tests
Target version:
-
Start date:
2020-03-26
Due date:
% Done:

100%

Estimated time:
40.00 h
Difficulty:

Description

Motivation

We have discussed this idea multiple times already, but I could not find a better ticket to reference. Many or all migration scenarios need a "fully patched" source system of an old product version as their starting point. In openQA the test module "patch_sle" covers this. As also described in #64824#note-5, this test module commonly takes very long and causes a lot of traffic by repeating the same operation over and over again. Instead, the scenarios could use common images that are updated whenever new maintenance patches become available for old products, and not only when a new SLE build is in development, since the two are independent.

Acceptance criteria

  • AC1: Migration tests still use fully patched systems, but patch_sle does not run for longer than an hour (ideally only a few minutes, if at all)

Further details

Also see #51350

Actions #1

Updated by okurz about 4 years ago

  • Priority changed from Normal to High

This is actually a very old request, but I am sorry that I could not find a better-fitting ticket describing it. Issues like #64824#note-5 show that patch_sle not only takes very long and has a big traffic impact, but also has a big impact on the size of job results, so we can keep fewer job results because the migration test results are very big.

Actions #2

Updated by okurz about 4 years ago

  • Subject changed from [migration] Enhance tests to not repeat "patch_sle" in so many scenarios to [sle][migration] Enhance tests to not repeat "patch_sle" in so many scenarios
Actions #3

Updated by leli about 4 years ago

  • Subject changed from [sle][migration] Enhance tests to not repeat "patch_sle" in so many scenarios to [sle][migration][backlog] Enhance tests to not repeat "patch_sle" in so many scenarios
  • Estimated time set to 20.00 h

I know this requirement; we will implement it when we have time. Added to backlog.

Actions #4

Updated by maritawerner almost 4 years ago

  • Assignee set to coolgw

Wei Gao: as you are the Product Owner for SP3, I would like to assign this ticket to you. Could you please look into this request and implement it for all affected test cases for the SP3 migration? The problem that we are running out of space is getting critical and will block openQA more and more.
Please let me know if you have any questions.

Actions #5

Updated by coolgw almost 4 years ago

maritawerner wrote:

Wei Gao: as you are the Product Owner for SP3, I would like to assign this ticket to you. Could you please look into this request and implement it for all affected test cases for the SP3 migration? The problem that we are running out of space is getting critical and will block openQA more and more.
Please let me know if you have any questions.

@Marita ok. I will have a look

Actions #6

Updated by leli almost 4 years ago

  • Assignee changed from coolgw to leli

In fact, Oliver talked a lot with me about this enhancement, the full update of existing images, when I visited Nuremberg. I have thought about it, but many other things occupied my time and I have not done more on it.

Currently, I think we can create a new job group and add test suites, or just use YAML, to run the update jobs for the existing qcow2 images. The test suites should include the corresponding qcow2 and the needed modules, set by SCHEDULE. We can consider how often to run it, for example once every two weeks. What still needs to be decided is how to deal with failed jobs, because openQA removes images very soon, so this may need a manual re-run, and how to move these qcow2 files to the fixed folder.

Actions #7

Updated by leli almost 4 years ago

@Oliver, hi, I will begin the investigation for this ticket. First I will run a case to update a qcow2. Several things are not very clear to me; could you help to clarify?

Test steps:

  1. Back up the qcow2 so that a bad image cannot overwrite the old qcow2.
  2. Create a job group and create a test suite with SCHEDULE set to the list of needed modules.
  3. Sanity-check the created qcow2 images (how? Running cases may waste time, and created images will be removed soon).
  4. Move these qcow2 files from the 'HDD' folder to the 'fixed' folder on the OSD server.

Questions:

  1. How often should it be triggered? I think once every one or two weeks is OK; we do not need to trigger it for each released maintenance update, a regular trigger is enough for us.
  2. How do we move it to the fixed folder on the OSD server? I can do this manually now, but that requires access to the OSD server from my laptop because of the ssh key binding. If we can do this with a public account, we can create a script for it.
Actions #8

Updated by leli almost 4 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 10
  • Estimated time changed from 20.00 h to 40.00 h

I found that the test suite should use INCLUDE_MODULES=''.

Waiting for the update job:
https://openqa.nue.suse.com/tests/overview?distri=sle&version=15-SP2&build=207.1&groupid=316

Actions #9

Updated by leli almost 4 years ago

Updated the test suite, since INCLUDE_MODULES needs to list the modules required to publish the qcow2; INCLUDE_MODULES appears to work like a whitelist.

Waiting for https://openqa.nue.suse.com/tests/4335208#

Actions #10

Updated by leli almost 4 years ago

After updating the qcow2, patch_sle took 11m29s (http://openqa.nue.suse.com/tests/4338948#details); before updating the qcow2 it took 24m42s (https://openqa.nue.suse.com/tests/4328509#).
So updating the qcow2 saved 13m13s in patch_sle, at least for this case and this run.
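
For anyone repeating this comparison on other scenarios, the subtraction can be scripted. This is only a small sketch: the helper name `parse_duration` is made up here, and it assumes durations are written in the `24m42s` form used above.

```python
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Parse a duration written like '24m42s' into a timedelta."""
    minutes, seconds = text.rstrip("s").split("m")
    return timedelta(minutes=int(minutes), seconds=int(seconds))


# patch_sle runtimes quoted in this comment
before = parse_duration("24m42s")  # job 4328509, original qcow2
after = parse_duration("11m29s")   # job 4338948, updated qcow2

saved = before - after
print(saved)                                  # 0:13:13
print(f"{saved / before:.0%} of the runtime")  # 54% of the runtime
```

A single pair of jobs is of course not a benchmark; the saving depends on how many patches were pending when the image was last refreshed.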

Actions #11

Updated by coolgw almost 4 years ago

  • Subject changed from [sle][migration][backlog] Enhance tests to not repeat "patch_sle" in so many scenarios to [sle][migration][SLE15SP3] Enhance tests to not repeat "patch_sle" in so many scenarios
Actions #12

Updated by coolgw almost 4 years ago

@Oliver, based on the following issue, I think we still need a test case that covers the GM + full patch scenario:
1165915 Update to ca-certificates-mozilla-2.40 on SLE-12 invalidates certificate on updates.suse.com

Actions #13

Updated by leli almost 4 years ago

@Oliver, I have finished part of the cases that fully update the existing qcow2 files, could you please have a look? The job group is https://openqa.nue.suse.com/tests/overview?distri=sle&version=15-SP2&build=209.2&groupid=316 . Please take this job as an example: https://openqa.nue.suse.com/tests/4445220

Please refer to comment #7 for the test steps, and please answer my two questions:

  1. How often should it be triggered? I think once every one or two months is OK; we do not need to trigger it for each released maintenance update, a regular trigger is easy and enough for us.
  2. How do we move it to the fixed folder on the OSD server? I can do this manually now, but that requires access to the OSD server from my laptop because of the ssh key binding. If we can do this with a public account, creating a script for it would be better.
Actions #14

Updated by okurz almost 4 years ago

questions:

  1. How often should it be triggered? I think once every one or two weeks is OK; we do not need to trigger it for each released maintenance update, a regular trigger is enough for us.

Yes, once a week should be fine, as you should still install any additional pending patches in the actual migration jobs, just to be sure all recent patches are installed. Depending on our observations we can adjust this to run more or less often. We should just find an approach that is configurable and flexible.
But additionally: ideally we would find an event-based trigger rather than a time-based one. I am thinking of something like "whenever enough maintenance updates are pending". Originally I thought we could just use the "Updates" maintenance tests, as they already provide installations with all accumulated maintenance updates applied. However, Maintenance has since changed its approach to install an explicit set of selected maintenance updates, so the repository setup is more complex than what we want in the migration tests. Maybe there is some other notification that informs about pending updates. For now I would stick with time-based triggers. As this means you need to run an actual trigger job somewhere, I suggest for example Travis CI cron-like jobs on https://github.com/os-autoinst/os-autoinst-distri-opensuse/ for openSUSE and a repository on https://gitlab.nue.suse.com for SLE.
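
To illustrate what "configurable and flexible" could mean for the trigger, here is a minimal sketch that combines the time-based rule with the optional event-based one. Everything in it is hypothetical: the function name, the default of one week, and the threshold of 20 pending updates are placeholders, and no source for the pending-update count has been identified in this ticket.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_refresh_images(
    last_refresh: datetime,
    now: datetime,
    pending_updates: Optional[int] = None,   # unknown unless some notification exists
    max_age: timedelta = timedelta(weeks=1),  # assumed default: weekly
    min_pending: int = 20,                    # assumed threshold, purely illustrative
) -> bool:
    """Decide whether the patched base images should be regenerated."""
    # Event-based rule: enough maintenance updates have accumulated.
    if pending_updates is not None and pending_updates >= min_pending:
        return True
    # Time-based fallback: the last refresh is older than the allowed age.
    return now - last_refresh >= max_age


now = datetime(2020, 6, 15)
print(should_refresh_images(now - timedelta(days=8), now))
print(should_refresh_images(now - timedelta(days=2), now, pending_updates=30))
```

With only a cron-like schedule available, the same effect is reached by simply running the trigger at the `max_age` interval; the function form just keeps both knobs in one place if an event source turns up later.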

  2. How do we move it to the fixed folder on the OSD server? I can do this manually now, but that requires access to the OSD server from my laptop because of the ssh key binding. If we can do this with a public account, we can create a script for it.

If you need an ssh key authorized on osd, please create a merge request for https://gitlab.nue.suse.com/openqa/salt-pillars-openqa/-/blob/master/sshd/users.sls . But ideally I would avoid any automatically updated images in the fixed folder, especially as the corresponding image should be reproducible by an automatic openQA job itself. Storing images in the non-fixed folders should be no problem: by default such files are only deleted after some days, and as soon as any job linked to a job group references these image files, they are accounted to the corresponding job group and kept according to the asset quota settings configured in that job group, depending on available space.

Actions #15

Updated by leli almost 4 years ago

okurz wrote:

questions:

  1. How often should it be triggered? I think once every one or two weeks is OK; we do not need to trigger it for each released maintenance update, a regular trigger is enough for us.

Yes, once a week should be fine, as you should still install any additional pending patches in the actual migration jobs, just to be sure all recent patches are installed. Depending on our observations we can adjust this to run more or less often. We should just find an approach that is configurable and flexible.
But additionally: ideally we would find an event-based trigger rather than a time-based one. I am thinking of something like "whenever enough maintenance updates are pending". Originally I thought we could just use the "Updates" maintenance tests, as they already provide installations with all accumulated maintenance updates applied. However, Maintenance has since changed its approach to install an explicit set of selected maintenance updates, so the repository setup is more complex than what we want in the migration tests. Maybe there is some other notification that informs about pending updates. For now I would stick with time-based triggers. As this means you need to run an actual trigger job somewhere, I suggest for example Travis CI cron-like jobs on https://github.com/os-autoinst/os-autoinst-distri-opensuse/ for openSUSE and a repository on https://gitlab.nue.suse.com for SLE.

Thanks. Could I just create a cron job on my laptop or on my workstation in the office?

  2. How do we move it to the fixed folder on the OSD server? I can do this manually now, but that requires access to the OSD server from my laptop because of the ssh key binding. If we can do this with a public account, we can create a script for it.

If you need an ssh key authorized on osd, please create a merge request for https://gitlab.nue.suse.com/openqa/salt-pillars-openqa/-/blob/master/sshd/users.sls . But ideally I would avoid any automatically updated images in the fixed folder, especially as the corresponding image should be reproducible by an automatic openQA job itself. Storing images in the non-fixed folders should be no problem: by default such files are only deleted after some days, and as soon as any job linked to a job group references these image files, they are accounted to the corresponding job group and kept according to the asset quota settings configured in that job group, depending on available space.

I think I have not described this clearly. My job updates a qcow2 and then publishes a new qcow2, but this new qcow2 cannot have the same name as the original one. So the published qcow2 gets a new name and needs to be renamed to the original name. That is why I think we can just move the images to the fixed folder under the original qcow2 name, and we can back them up if needed.

Actions #16

Updated by okurz almost 4 years ago

leli wrote:

Thanks. Could I just create a cron job on my laptop or on my workstation in the office?

Sure, you could, but this would be the least maintainable option: there is no easy way for others to help if it breaks, there is no monitoring, and it is hard to document. Using a non-personal instance is more sustainable.

[…] I think I have not described this clearly. My job updates a qcow2 and then publishes a new qcow2, but this new qcow2 cannot have the same name as the original one. So the published qcow2 gets a new name and needs to be renamed to the original name. That is why I think we can just move the images to the fixed folder under the original qcow2 name, and we can back them up if needed.

I have understood this and I still think it is not necessary to move anything to the fixed folder. It is a good idea to have a backup of images, but I don't understand why it is not possible to publish the updated image under the same filename. Also, maybe you can try to solve that with multiple openQA jobs, e.g. one to patch the qcow2 and one to rename it back to the general name. If all of this is not easily possible within openQA, then I suggest "CI jobs" to do that, e.g. within gitlab CI or Travis CI, as previously mentioned for the trigger already.

Actions #17

Updated by coolgw almost 4 years ago

@Oliver
A question on how to use Travis CI: my understanding is that Travis CI can only be triggered by a new commit, so how can we make it trigger regularly? Could you give us a simple example that executes a script every two weeks?

Actions #18

Updated by okurz almost 4 years ago

As discussed in chat: Our existing travis CI definitions are available in https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/.travis.yml . https://docs.travis-ci.com/user/cron-jobs/ is the documentation about "cron jobs" in travis CI. https://travis-ci.org/ is the main travis CI page.

CI jobs can run for the master branch. But first I recommend you try it out in a personal fork and enable Travis CI for your fork. If you are happy with the changes, you can create a pull request with the corresponding changes in .travis.yml and we can enable the corresponding Travis CI cron jobs within the main repo. I recommend you start with the documentation at https://docs.travis-ci.com/ . It's really good and probably way better at answering questions than I would be. For sensitive data like API keys and secrets, https://docs.travis-ci.com/user/encryption-keys/ should probably help. Keep in mind that Travis CI, as a public service, can be helpful for anything on openqa.opensuse.org. An alternative is "GitHub Actions". For SUSE-internal triggers I recommend gitlab CI using our internal instance gitlab.nue.suse.com . Many teams already use that in conjunction with openQA, e.g. to maintain the job templates. .travis.yml is only used by Travis CI. For gitlab CI you need a file .gitlab-ci.yml , so similar but different 🙂 See https://docs.gitlab.com/ee/ci/yaml/ .
A good example is probably https://gitlab.suse.de/qsf-y/qa-sle-functional-y from QSF-y, which loads the job template YAML documents using gitlab CI. https://gitlab.suse.de/qsf-y/qa-sle-functional-y/-/blob/master/.gitlab-ci.yml#L8 shows how they set openQA client parameters based on variables. These variables are defined within the gitlab project.
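
As a rough illustration of "client parameters based on variables", the sketch below assembles an `openqa-cli api` call from environment variables that a CI project could define. The variable names `OPENQA_HOST`, `OPENQA_API_KEY` and `OPENQA_API_SECRET`, the helper name, and the example settings are all assumptions made for this sketch; they are not taken from the QSF-y project. The flags `--host`, `--apikey`, `--apisecret` and `-X` are used as I understand the openqa-cli documentation, so please verify them before relying on this.

```python
import os
import shlex
from typing import Dict, List, Mapping


def build_trigger_command(env: Mapping[str, str], settings: Dict[str, str]) -> List[str]:
    """Assemble an openqa-cli call that posts a product ('isos') with the given settings."""
    # Hypothetical variable name; the fallback host is a placeholder.
    host = env.get("OPENQA_HOST", "https://openqa.example.com")
    cmd = ["openqa-cli", "api", "--host", host]
    # Credentials are only added when both CI variables are defined.
    if "OPENQA_API_KEY" in env and "OPENQA_API_SECRET" in env:
        cmd += ["--apikey", env["OPENQA_API_KEY"], "--apisecret", env["OPENQA_API_SECRET"]]
    cmd += ["-X", "POST", "isos"]
    # Job settings are passed as KEY=VALUE arguments, sorted for reproducible output.
    cmd += [f"{key}={value}" for key, value in sorted(settings.items())]
    return cmd


# Placeholder settings; a real trigger also needs FLAVOR, ARCH, BUILD and so on.
command = build_trigger_command(os.environ, {"DISTRI": "sle", "VERSION": "15-SP2"})
print(shlex.join(command))
```

The sketch only prints the command instead of executing it, so it is safe to try locally; in a CI job the resulting list could be handed to `subprocess.run`.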

Actions #19

Updated by leli over 3 years ago

  • % Done changed from 10 to 80

With Oliver's help, I created a gitlab project to trigger the job group that updates the migration qcow2 files: https://gitlab.suse.de/leli/scripts-for-ci/-/blob/master/.gitlab-ci.yml . We can schedule the trigger on a regular cycle, such as bi-weekly.

At first we will trigger the job group to update the migration qcow2 files every two weeks. The updated qcow2 is renamed to $orig+'-updated'.qcow2, and we will update HDD_1 in our test suites to follow the new qcow2 name.
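
The naming rule can be written down as a tiny helper, for example to generate the new HDD_1 values for the test suites. This is just a sketch of the $orig+'-updated'.qcow2 convention; the function name is made up, and the real image names in the job group may need extra handling.

```python
from pathlib import PurePosixPath


def updated_image_name(original: str, suffix: str = "-updated") -> str:
    """Insert the suffix before the file extension: orig.qcow2 -> orig-updated.qcow2."""
    path = PurePosixPath(original)
    return str(path.with_name(f"{path.stem}{suffix}{path.suffix}"))


print(updated_image_name("orig.qcow2"))  # orig-updated.qcow2
```

Because the original file keeps its name, the update job can always start from the same source image and overwrite only the `-updated` copy.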

Talking about this in rocket chat with Oliver:
########################
7:38 PM
In fact, I mean the update job boots orig.qcow2 and publishes orig-updated.qcow2. orig.qcow2 stays in the fixed folder, and orig-updated.qcow2 lives in the hdd folder. We can change our test suites to use orig-updated.qcow2 and update orig.qcow2 every week to create a new orig-updated.qcow2.
okurz
Oliver Kurz @okurz
7:39 PM
Yes, I understand that. I think this approach is simple enough you can try that first. Based on experiences you can improve it in a later step, e.g. only patch the diff on an older already-updated image
##########################

The job group that updates the migration qcow2 files:
https://openqa.nue.suse.com/tests/overview?distri=sle&version=15-SP2&build=209.2&groupid=316

Actions #20

Updated by leli over 3 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 80 to 100