action #92125: Move "MR" on submission tests into a separate job group - QA (public) - openSUSE Project Management Tool

action #92125

closed

Move "MR" on submission tests into a separate job group

Added by okurz about 4 years ago. Updated over 3 years ago.

Status:

Resolved

Priority:

Normal

Assignee:

osukup

Target version:

openQA Project (public) - Ready

Start date:

2021-05-04

Due date:

2021-07-30

% Done:

Estimated time:

Description

Motivation¶

Discussed during a meeting about "Shift Left": Currently, MR "on submission" tests are scheduled as part of the already existing incident tests. The focus for "on submission" tests is on selecting super-stable tests. For this we need to be able to select individual job scenarios and also exclude job modules, e.g. using EXCLUDE_MODULES. As a first example scenario, "mau-sles-robot-fw" was mentioned.
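
To illustrate the module exclusion mentioned above, here is a minimal Python sketch. It assumes that EXCLUDE_MODULES holds a comma-separated list of module names; the function name and the module names are made up for illustration and are not taken from openQA or the bot.

```python
def filter_modules(modules, exclude_modules):
    """Return the modules that are not named in a comma-separated exclude string."""
    # Assumption: EXCLUDE_MODULES is a comma-separated list of module names.
    excluded = {name.strip() for name in exclude_modules.split(",") if name.strip()}
    return [module for module in modules if module not in excluded]


# Hypothetical module names, for illustration only.
schedule = ["boot_to_desktop", "update_install", "robot_fw", "flaky_check"]
print(filter_modules(schedule, "flaky_check"))
```

With an empty exclude string, the schedule is returned unchanged, so a scenario without exclusions behaves exactly as before.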

Acceptance criteria¶

AC1: MR on submission tests are scheduled within separate job groups with their own schedule

Suggestions¶

Based on https://gitlab.suse.de/qa-maintenance/openQABot/-/merge_requests/56, change the bot to schedule MR tests with just a single test scenario, "sles4sap_robot_fw"

Related issues 1 (0 open — 1 closed)

Updated by okurz about 4 years ago

Project changed from 46 to QA (public)

Updated by okurz about 4 years ago

Ondřej Súkup can you help me with this ticket here? Did you implement the triggering of MR tests within the openQA maintenance bot?

Updated by osukup about 4 years ago

Yes, I implemented this in openQABot. I thought the focus was to run incident tests before the MR is accepted, to catch incident issues as early as possible.

Updated by osukup about 4 years ago

It is possible to create separate groups for MR incidents, but this mainly increases the maintenance overhead for the test groups and bot data (duplicating more or less the same configs for openQA and the bot), with only small added value: separated builds and better accountability for the openQA resources used.

Updated by okurz about 4 years ago

osukup wrote:

It is possible to create separate groups for MR incidents, but this mainly increases the maintenance overhead for the test groups and bot data (duplicating more or less the same configs for openQA and the bot), with only small added value: separated builds and better accountability for the openQA resources used.

Exactly. I had the same concern. The idea here is that only a very stable and fast subset of scenarios is used for "on submission" tests and that this configuration is maintained by "Maintenance", not by "QE". Could you please provide a bit more information:

how to change the bot to trigger a different test schedule for MR than for incident tests?
how to configure the bot to trigger the corresponding tests? Which "openQA medium" does it trigger for MR tests?

For further reference, the confluence page for "Shift Left" and particular "on submission" testing is available on https://confluence.suse.com/pages/viewpage.action?pageId=723878219

Updated by osukup about 4 years ago

okurz wrote:

osukup wrote:

It is possible to create separate groups for MR incidents, but this mainly increases the maintenance overhead for the test groups and bot data (duplicating more or less the same configs for openQA and the bot), with only small added value: separated builds and better accountability for the openQA resources used.

Exactly. I had the same concern. The idea here is that only a very stable and fast subset of scenarios is used for "on submission" tests and that this configuration is maintained by "Maintenance", not by "QE". Could you please provide a bit more information:

how to change the bot to trigger a different test schedule for MR than for incident tests?
A new config for the bot, more or less the same as for classic incident jobs, with new specialized flavors.

In addition, new FLAVORs need to be configured in openQA, along with new job groups, and of course someone has to maintain this and keep it in sync with the others.

how to configure the bot to trigger the corresponding tests? Which "openQA medium" does it trigger for MR tests?
There is a problem: we have less information about MR jobs than about standard incidents, so we can't trigger jobs based on the included binaries.

For further reference, the confluence page for "Shift Left" and particular "on submission" testing is available on https://confluence.suse.com/pages/viewpage.action?pageId=723878219

Updated by okurz about 4 years ago

https://gitlab.suse.de/qa-maintenance/openQABot/-/merge_requests/56 is the original MR that brought in changes for MRs.

"flavor" is important for https://gitlab.suse.de/qa-maintenance/openQABot/-/merge_requests/56/diffs#5e51e5be70701a2e1c4ddcb96edd12c9dd8589c5_174_196 , right? I assume this is configured on qam2.suse.de in /etc/openqa/bot.yml . The file is deployed by ansible from the "qam-metadata-openqabot" package, living in https://gitlab.suse.de/qa-maintenance/metadata/-/blob/master/bot/bot.yml . From git to IBS to host over ansible.

https://gitlab.suse.de/qa-maintenance/openQABot/-/blob/master/systemd/openqabot-mr.timer shows how maintenance requests are scheduled every hour

Updated by okurz about 4 years ago

Subject changed from feasibility to move "MR" on submission tests into a separate job group to Mmove "MR" on submission tests into a separate job group
Description updated (diff)
Status changed from New to Workable

Updated by okurz about 4 years ago

Subject changed from Mmove "MR" on submission tests into a separate job group to Move "MR" on submission tests into a separate job group

#11

Updated by osukup almost 4 years ago

From my point of view, this change only adds an unnecessary burden. With very small benefits.

Cons:

It will practically duplicate the QEM incident jobs
It will be unclear who maintains this and keeps it in sync with the incident jobs
Any new change or fix in the QEM incident jobs will need to be duplicated in the MR groups
The data for scheduling jobs will be the same; the only thing that changes is FLAVOR
The same goes for OSD: a big bunch of new media that differ only in FLAVOR

Benefit:

separated Maintenance and QEM jobs (currently they differ only in BUILD)
--> better accountability

But we can separate the jobs simply in the view if we add the possibility to filter jobs by patterns/regexps on variables or the BUILD value in openQA's test overview.
And more or less the same goes for accounting: we can distinguish resource usage based on BUILD.

BUILD in incident jobs is constructed with a simple schema:

standard job - :INC_NR:PKG_NAME
maintenance request - MR:REQ_NR:PKG_NAME

In addition, L3 runs a PoC with openQA (currently only for the kernel),
and for the kernel we also have KOTD jobs, which have BUILD=KERNEL_VERSION.
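
As a rough illustration of how jobs could be told apart by BUILD alone, here is a minimal Python sketch of the schema above. It assumes that the incident and request numbers are purely numeric; the function name, the category labels and the sample incident and kernel values are made up for illustration.

```python
import re

# Patterns following the schema described above.
# Assumption: INC_NR and REQ_NR consist of digits only.
MR_RE = re.compile(r"^MR:(?P<number>\d+):(?P<package>.+)$")
INCIDENT_RE = re.compile(r"^:(?P<number>\d+):(?P<package>.+)$")


def classify_build(build):
    """Return (kind, number, package) for a BUILD value."""
    match = MR_RE.match(build)
    if match:
        return ("maintenance_request", int(match["number"]), match["package"])
    match = INCIDENT_RE.match(build)
    if match:
        return ("incident", int(match["number"]), match["package"])
    # Anything else, e.g. a KOTD job with BUILD=KERNEL_VERSION.
    return ("other", None, None)


for build in ("MR:244015:clamav", ":12345:bash", "5.3.18"):
    print(build, "->", classify_build(build))
```

A filter like this, applied in the test overview or in accounting scripts, would be one way to separate MR jobs from incident jobs without introducing new job groups.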

#12

Updated by okurz almost 4 years ago

Status changed from Workable to Blocked
Assignee set to okurz

Thank you for your nice evaluation. I agree with your assessment which is why also in #91082#note-5 I have suggested to go ahead with the current test structure as is. Let's wait for feedback on that.

Setting blocked on #91082

#13

Updated by okurz almost 4 years ago

Status changed from Blocked to Workable
Assignee deleted (~~okurz~~)

#91082#note-6 explains again that the preference is to have that separate job group with its own schedule.

#14

Updated by okurz almost 4 years ago

The topic was brought up again by hrommel1+cyberiad and it was clarified that this request is also more important than "multiple package version in incident" tickets.

Out of scope: Git versioning for the separate job group schedule so I expect a single job group with a single job template of "sles4sap_robot_fw".

#15

Updated by okurz almost 4 years ago

Status changed from Workable to In Progress
Assignee set to okurz

I created https://gitlab.suse.de/qa-maintenance/qam-openqa-yml/-/merge_requests/144 with osukup to create the schedule. osukup will do the corresponding bot code changes.

#16

Updated by openqa_review almost 4 years ago

Due date set to 2021-07-07

Setting due date based on mean cycle time of SUSE QE Tools

#17

Updated by okurz almost 4 years ago

https://gitlab.suse.de/qa-maintenance/qam-openqa-yml/-/merge_requests/144 approved, not yet merged.

#18

Updated by okurz almost 4 years ago

Assignee changed from okurz to osukup

@osukup what's necessary for https://gitlab.suse.de/qa-maintenance/qam-openqa-yml/-/merge_requests/144 to be merged? and how are you doing regarding getting the bot aligned for on submission tests?

#19

Updated by osukup almost 4 years ago

okurz wrote:

@osukup what's necessary for https://gitlab.suse.de/qa-maintenance/qam-openqa-yml/-/merge_requests/144 to be merged? and how are you doing regarding getting the bot aligned for on submission tests?

I merged it, although I'm not directly involved in qam-openqa-yml.

The bot is now scheduling 15-SP3 Incidents-MR for 'On Submission'

#20

Updated by okurz almost 4 years ago

Can you please reference the corresponding bot code changes?

#21

Updated by okurz almost 4 years ago

Also there are tests like https://openqa.suse.de/tests/6328338 still triggered in the other job groups while we should only trigger within the "on submission" job group.

#22

Updated by okurz almost 4 years ago

I did https://gitlab.suse.de/qa-maintenance/qam-openqa-yml/-/merge_requests/151 to fix the dependencies for the test suites. Tests are in place, separate job group, separate schedule, maintained in git repo. I wanted to verify that the two existing scenarios are fine but that is currently blocked by the network problems within SUSE R&D. The problem is within https://gitlab.suse.de/qa-maintenance/qam-openqa-yml/-/jobs/477393#L28 because gitlab CI runners can't access gitlab.suse.de. That's a problem that had been reported to EngInfra already.

EDIT: I could successfully retrigger, so https://gitlab.suse.de/qa-maintenance/qam-openqa-yml/-/jobs/477873 has now succeeded.
Now we can wait for https://openqa.suse.de/tests/6355977 as one example to show whether the test works for an MR.

EDIT: The above was cancelled (reason unknown) but https://openqa.suse.de/tests/6355975 has passed.

#23

Updated by osukup almost 4 years ago

The solution with '*' instead of a version doesn't really work, so it is back to trees and copy/pasting a big bunch of product templates.

Note: the bot schedules 15-SP2 and 15-SP3 for this test group, but only the last one scheduled is run.

#24

Updated by okurz almost 4 years ago

What do you mean by "version" here? And what trees?

EDIT: OK, I understood the part about the version now. But I don't see a need to do any copy-pasting here. Either we don't need to schedule more than one version or we extend openQA to provide what we need from it.

Also, still, can you please reference the corresponding bot code changes that you did? Simply a link to a merge request or commits.

Also there are tests like https://openqa.suse.de/tests/overview?distri=sle&version=15-SP2&build=MR%3A244015%3Aclamav&groupid=306 still triggered in the incident job groups while we should only trigger within the "on submission" job group. Can you please comment on that?
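
For checking in which job group a given MR build shows up, the overview link above can be rebuilt for other builds or groups. The following is a minimal Python sketch that only reproduces the query parameters visible in that link (distri, version, build, groupid); the helper name is made up for illustration.

```python
from urllib.parse import urlencode


def overview_url(host, distri, version, build, groupid):
    """Build a test overview URL from the query parameters seen in the link above."""
    query = urlencode(
        {"distri": distri, "version": version, "build": build, "groupid": groupid}
    )
    return f"https://{host}/tests/overview?{query}"


# Reproduces the link quoted above (incident job group 306).
print(overview_url("openqa.suse.de", "sle", "15-SP2", "MR:244015:clamav", 306))
```

Changing only the groupid argument gives the same build in a different job group, which makes it easy to compare where MR jobs were actually triggered.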

#25

Updated by okurz almost 4 years ago

@osukup why have you done https://gitlab.suse.de/qa-maintenance/metadata/-/merge_requests/491 ? I have explained to you explicitly that I don't see the need to schedule any more tests for MR for now as long as we don't have the feedback from maintenance that this is what they need. And please give others a chance for merge request review. I suggest you revert the MR before introducing this big blob of hard to maintain text duplication.

#26

Updated by okurz almost 4 years ago

Status changed from In Progress to Feedback
Assignee changed from osukup to okurz

The problem has been noticed by jmichel in https://chat.suse.de/channel/qa-sap-ha?msg=AAWzYCkioikz5rLgQ as well, stating that the test schedule should be reduced again. I couldn't discuss the problem with you over other channels, hence I have now created a revert with https://gitlab.suse.de/qa-maintenance/qam-openqa-yml/-/merge_requests/156

Let me handle the next steps and crosscheck with requesters what to do next.

#27

Updated by okurz almost 4 years ago

Someone also created multiple job groups: https://openqa.suse.de/group_overview/390 https://openqa.suse.de/group_overview/389 https://openqa.suse.de/group_overview/388 https://openqa.suse.de/group_overview/387 . I don't think we should have multiple job groups, and certainly not individual ones per version.

#28

Updated by okurz almost 4 years ago

Related to action #95075: Find jobs matching search parameters over /api/v1/jobs (especially documentation) size:S added

#29

Updated by okurz almost 4 years ago

Assignee changed from okurz to osukup

next coordination meeting conducted, see notes in https://confluence.suse.com/pages/viewpage.action?pageId=723878219&focusedCommentId=778469603#comment-778469603

Delete version specific job groups
Check why there have been no tests scheduled for 5 days in the "On Submission" test
Delete schedule for MR tests in single incident groups
Adapt openQABot bot to feed back results from "on submission" job group, not MR tests within single incident group

@osukup which parts can you take over?

#30

Updated by okurz almost 4 years ago

Due date changed from 2021-07-07 to 2021-07-30
Priority changed from High to Normal

As this shows, we rely heavily on the prior knowledge of individuals, hence we effectively cannot treat this with high priority. Due to an unforeseen absence, I am bumping the due date much further into the future as a grace period.

#31

Updated by ilausuch almost 4 years ago

@osukup, do you need help on this ticket? Maybe someone in the team could help you.

#32

Updated by okurz almost 4 years ago

I just found out about https://gitlab.suse.de/qa-maintenance/openQABot/-/blob/master/systemd/openqabot-mrsep.timer, which I was not aware of before. If this is the service that triggers "On Submission" tests in service specific job groups then we do not need them at all and this service should be disabled.

Actions

Copy link

#33

Updated by osukup almost 4 years ago

Delete version specific job groups
done

Check why there have been no tests scheduled for 5 days in the "On Submission" test
Because a flavor defined with the same name but a matching version has higher priority than one defined with *.

Delete schedule for MR tests in single incident groups
done

Adapt openQABot bot to feed back results from "on submission" job group, not MR tests within single incident group
done

#34

Updated by osukup almost 4 years ago

And from the first results: as expected, only the last scheduled version is started, and of course the SLE12SP* jobs will fail.

#35

Updated by livdywan almost 4 years ago

osukup wrote:

And from the first results: as expected, only the last scheduled version is started, and of course the SLE12SP* jobs will fail.

Are you going to work on the 4 items above? In that case please update the due date and make this "in progress". And if possible some outlook of the steps required, so others can help out and validate if things work as expected

#36

Updated by osukup almost 4 years ago

cdywan wrote:

osukup wrote:

And from the first results: as expected, only the last scheduled version is started, and of course the SLE12SP* jobs will fail.

Are you going to work on the 4 items above? In that case please update the due date and make this "in progress". And if possible some outlook of the steps required, so others can help out and validate if things work as expected

The 4 items above are all marked done, but I don't think the current state is what is desired or a valid solution (on the brighter side, the load on openQA will be significantly lower).

#37

Updated by okurz almost 4 years ago

osukup wrote:

Check why there have been no tests scheduled for 5 days in the "On Submission" test

Because a flavor defined with the same name but a matching version has higher priority than one defined with *.

So deleting the version specific product definitions in the schedule again should fix it, I assume. This can be an improvement point for the future to simplify scheduling from the openQA side without needing clunky workarounds from test schedule maintainers.
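
The priority rule described above can be pictured with a small Python sketch. This is only a toy model of the behavior reported in this ticket, not openQA's actual scheduling code; the function name, the dictionary keys and the flavor name are made up for illustration.

```python
def pick_product(products, flavor, version):
    """Pick the product for a flavor, preferring an exact version over the '*' wildcard."""
    candidates = [
        product
        for product in products
        if product["flavor"] == flavor and product["version"] in (version, "*")
    ]
    # False sorts before True, so exact-version entries come first.
    candidates.sort(key=lambda product: product["version"] == "*")
    return candidates[0] if candidates else None


# Hypothetical product definitions, for illustration only.
products = [
    {"flavor": "Example-MR", "version": "*", "group": "On Submission"},
    {"flavor": "Example-MR", "version": "15-SP3", "group": "version specific"},
]

# The version specific definition shadows the wildcard one.
print(pick_product(products, "Example-MR", "15-SP3")["group"])

# After deleting the version specific definition, the wildcard entry is used.
wildcard_only = [product for product in products if product["version"] == "*"]
print(pick_product(wildcard_only, "Example-MR", "15-SP3")["group"])
```

In this model, a leftover version specific definition silently takes the jobs away from the wildcard entry, which matches the symptom of no tests being scheduled in "On Submission".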

EDIT: I provided an update about the current status in https://chat.suse.de/group/initialmr?msg=kRSgNNFwD3FdPmfjz

Triggering fixed. Now again only tests within the job group "On Submission" are triggered and no other tests in any other job group: https://openqa.suse.de/parent_group_overview/36#grouped_by_build . https://openqa.suse.de/tests/overview?distri=sle&version=15-SP3&build=MR:247582:aspell&groupid=385 is an example of 2 passed jobs. https://openqa.suse.de/tests/6639169#step/accept_license/2 is a new failure, reason unknown

#38

Updated by livdywan almost 4 years ago

Due date changed from 2021-07-30 to 2021-08-06

EDIT: I provided an update about the current status in https://chat.suse.de/group/initialmr?msg=kRSgNNFwD3FdPmfjz

I don't know what's mentioned there so I will read this as: status is still being discussed and at the least takes til the end of the week.

#39

Updated by okurz almost 4 years ago

Due date changed from 2021-08-06 to 2021-07-30

cdywan wrote:

EDIT: I provided an update about the current status in https://chat.suse.de/group/initialmr?msg=kRSgNNFwD3FdPmfjz

I don't know what's mentioned there so I will read this as: status is still being discussed and at the least takes til the end of the week.

No. I only provided what is mentioned in #92125#note-37 already. Further improvements how to handle a similar situation in the future have already been discussed outside this ticket. If osukup does not see further tasks necessary to cover AC1 from #92125#Acceptance-criteria we can resolve.

#40

Updated by osukup almost 4 years ago

Yes, AC1 seems fulfilled: separate job group and schedule.

#41

Updated by osukup almost 4 years ago

Status changed from Feedback to Resolved

#42

Updated by hrommel1 over 3 years ago

Status changed from Resolved to In Progress

Reopened and put into progress because we still have job groups bound to a specific product version:

https://openqa.suse.de/group_overview/390
https://openqa.suse.de/group_overview/389
https://openqa.suse.de/group_overview/388
https://openqa.suse.de/group_overview/387

AFAIR the agreement was to remove those groups and have everything in group

https://openqa.suse.de/group_overview/385

#43

Updated by osukup over 3 years ago

Status changed from In Progress to Resolved

hrommel1 wrote:

Reopened and put into progress because we still have job groups bound to a specific product version:

https://openqa.suse.de/group_overview/390
https://openqa.suse.de/group_overview/389
https://openqa.suse.de/group_overview/388
https://openqa.suse.de/group_overview/387

AFAIR the agreement was to remove those groups and have everything in group

https://openqa.suse.de/group_overview/385

Everything is in this group (385); all other groups are residue from the past.
