action #107014
coordination #91646 (closed): [saga][epic] SUSE Maintenance QA workflows with fully automated testing, approval and release
openQA Project - coordination #89062: [epic] Simplify review for SUSE QAM
trigger openqa-trigger-bisect-jobs from our automatic investigations whenever the cause is not already known size:M
0%
Description
Motivation
SUSE SLE maintenance aggregate tests can fail if any of the included incident updates causes a problem. The challenge is to find out which of the incident updates caused it. For this okurz created https://github.com/os-autoinst/scripts/blob/master/openqa-trigger-bisect-jobs . We already have automatic investigation jobs when a failure is not known and labeled accordingly in openQA. So we should combine both and trigger "openqa-trigger-bisect-jobs" as part of the automatic investigation as well.
Acceptance criteria
- AC1: For SUSE SLE maintenance aggregate test failures, openqa-trigger-bisect-jobs is run additionally to openqa-investigate, creating automatic investigation jobs
- AC2: We see all those investigation jobs listed in comments on the failed job
- AC3: Other jobs do not trigger openqa-trigger-bisect-jobs (or the script aborts early without failure)
- AC4: OSD is not overwhelmed with openqa-trigger-bisect-jobs jobs
Suggestions
- Take a look into https://github.com/os-autoinst/salt-states-openqa/blob/master/openqa/server.sls#L81 how we trigger investigation jobs from job done hooks. We basically call https://github.com/os-autoinst/scripts/blob/master/openqa-label-known-issues-and-investigate-hook
- Either extend that hook or create another one
- Either create a new comment besides the one from openqa-investigate, or list all investigate-jobs in one comment
Further details
See https://suse.slack.com/archives/C02D16TCP99/p1645022828339319 for more context if needed
Updated by okurz over 2 years ago
- Related to coordination #94105: [epic] Use feedback from openqa-investigate to automatically inform on github pull requests, open tickets, weed out automatically failed tests added
Updated by okurz over 2 years ago
- Subject changed from trigger https://github.com/os-autoinst/scripts/blob/master/openqa-trigger-bisect-jobs from our automatic investigations whenever the cause is not already known to trigger https://github.com/os-autoinst/scripts/blob/master/openqa-trigger-bisect-jobs from our automatic investigations whenever the cause is not already known size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by tinita over 2 years ago
- Status changed from Workable to In Progress
- Assignee set to tinita
Updated by tinita over 2 years ago
I've been getting familiar with the script, and I'm trying to make it more robust and return early instead of letting python die with a generic error message. (e.g. if there is no OS_TEST_ISSUES or the regex does not match)
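The "return early" idea can be illustrated with a minimal sketch. It is not the code from the script or the PR. It assumes that OS_TEST_ISSUES holds a comma-separated list of numeric incident IDs; the names parse_incidents, INCIDENT_LIST and main are hypothetical and only serve the illustration.

```python
import re

# Assumption for this sketch: a comma-separated list of numeric incident IDs.
INCIDENT_LIST = re.compile(r"^\d+(,\d+)*$")


def parse_incidents(settings, key="OS_TEST_ISSUES"):
    """Return the incident IDs as a list of strings, or None if unusable."""
    value = settings.get(key)
    if not value:
        # The setting is missing or empty: nothing to bisect.
        return None
    value = value.strip()
    if not INCIDENT_LIST.match(value):
        # The value does not look like an incident list.
        return None
    return value.split(",")


def main(settings):
    """Exit cleanly with a clear message instead of raising a generic error."""
    incidents = parse_incidents(settings)
    if incidents is None:
        print("No usable OS_TEST_ISSUES found, nothing to bisect")
        return 0
    print("Would bisect incidents: " + ", ".join(incidents))
    return 0
```

Returning 0 in the "nothing to do" case matches AC3, where the script may abort early without failure.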
Updated by openqa_review over 2 years ago
- Due date set to 2022-03-12
Setting due date based on mean cycle time of SUSE QE Tools
Updated by tinita over 2 years ago
PR: https://github.com/os-autoinst/scripts/pull/136 to make the script more robust
Updated by tinita over 2 years ago
Minimal alternative PR: https://github.com/os-autoinst/scripts/pull/137
Updated by tinita over 2 years ago
- Due date changed from 2022-03-12 to 2022-03-19
I didn't work on it because of the internal hackweek and an urgent ticket.
Updated by livdywan over 2 years ago
- Due date changed from 2022-03-19 to 2022-03-25
I'm bumping the due date to account for availability
tinita wrote:
PR: https://github.com/os-autoinst/scripts/pull/136 to make the script more robust
Maybe next week we can see if it makes sense to e.g. take tests from this PR, assuming the minimal change is otherwise enough
Updated by livdywan over 2 years ago
- Subject changed from trigger https://github.com/os-autoinst/scripts/blob/master/openqa-trigger-bisect-jobs from our automatic investigations whenever the cause is not already known size:M to trigger openqa-trigger-bisect-jobs from our automatic investigations whenever the cause is not already known size:M
- Due date changed from 2022-03-25 to 2022-04-01
Bumping the due date based on availability
Updated by tinita over 2 years ago
Python. Yay.
I spent the whole afternoon trying to find out how to:
- load a module from a filename like 'openqa-trigger-bisect-jobs' without a suffix and with dashes in the filename (no, openqa = __import__("openqa-trigger-bisect-jobs") only works if the script has a .py suffix)
- I was pointed to imp.load_source and that works: openqa = imp.load_source('openqa', rootpath + '/openqa-trigger-bisect-jobs'). It took me a while because it wasn't clear what parameters it would get
- then I was pointed to importlib as imp is going to be deprecated
- I fought with importlib and found no way to load a script as a module with it
- so I'm going to go back to imp for now and let python lovers actually replace it when the time has come
btw, in perl it would be simply require 'path/to/script'; my $mock = Test::MockModule->new('main'); $mock->redefine(script_function_name => $new_coderef);
(btw, thanks to Alberto Planas Dominguez and Martin Doucha for the help)
Updated by tinita over 2 years ago
- Status changed from In Progress to Feedback
PR updated with pytest: https://github.com/os-autoinst/scripts/pull/136
Currently failing because it can't find mocker.
Updated by tinita over 2 years ago
- Due date changed from 2022-04-01 to 2022-04-05
Bumping due date because I'll be at a workshop
Updated by tinita over 2 years ago
- Due date changed from 2022-04-05 to 2022-04-09
This is an endless task. There are things which are better in python than in perl, sure, but python makes up for this by doing other things worse :(
https://github.com/os-autoinst/scripts/runs/5817697190?check_suite_focus=true
tests/test_trigger_bisect_jobs.py:8
/home/runner/work/scripts/scripts/tests/test_trigger_bisect_jobs.py:8: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
Updated by tinita over 2 years ago
Thanks to Martin Doucha and Nick Singer and others for your help.
The replacement for imp is importlib.machinery.SourceFileLoader, and as @okurz requested, I also replaced the _replace call with something else which is IMHO much less readable, but so be it.
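For readers who hit the same problem, here is a minimal sketch of loading a suffix-less script with dashes in its name through importlib.machinery.SourceFileLoader. It is not necessarily what the merged PR does. The helper name load_script_as_module is hypothetical, and the sketch uses exec_module rather than the deprecated load_module.

```python
import importlib.machinery
import importlib.util


def load_script_as_module(name, path):
    """Load the Python source file at `path` as a module called `name`.

    The file needs neither a .py suffix nor a valid identifier as its
    filename, because the loader is given the path explicitly.
    """
    loader = importlib.machinery.SourceFileLoader(name, path)
    spec = importlib.util.spec_from_loader(name, loader)
    module = importlib.util.module_from_spec(spec)
    # Runs the script's top-level code inside the new module object.
    # The module is not registered in sys.modules by this helper.
    loader.exec_module(module)
    return module
```

In a test this could be called as openqa = load_script_as_module('openqa', rootpath + '/openqa-trigger-bisect-jobs'), mirroring the imp.load_source call quoted earlier in this ticket. Because the script's top-level code runs on load, the script should keep its entry point behind an `if __name__ == "__main__":` guard.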
Updated by tinita over 2 years ago
- Status changed from Feedback to In Progress
PR merged, now I have to make sure the script is called on osd
Updated by tinita over 2 years ago
PR for replacing deprecated load_module: https://github.com/os-autoinst/scripts/pull/140
Updated by tinita over 2 years ago
https://github.com/perlpunk/scripts/tree/hook-call-trigger-bisect
- added openqa-trigger-bisect-jobs to the hook script
- added a test for the hook script
Still would like to run it on a real osd job tomorrow.
Updated by tinita over 2 years ago
Updated by mkittler over 2 years ago
- Related to action #95783: Provide support for multi-machine scenarios handled by openqa-investigate size:M added
Updated by tinita over 2 years ago
PR for the chained jobs Marius mentioned: https://github.com/os-autoinst/scripts/pull/142
edit: merged 2022-04-06
Updated by tinita over 2 years ago
Updated by tinita over 2 years ago
- Description updated (diff)
- Status changed from Feedback to In Progress
Updated by livdywan over 2 years ago
- Due date changed from 2022-04-09 to 2022-04-15
I'm assuming we want to monitor this over the next few days (AC4), hence bumping the due date.
Updated by tinita over 2 years ago
https://github.com/os-autoinst/scripts/pull/143 - Write comment after creating bisect investigate jobs
See example comments here: https://openqa.opensuse.org/tests/2282708#comments
Updated by okurz over 2 years ago
- Due date changed from 2022-04-15 to 2022-04-22
- Assignee changed from tinita to okurz
Quite some jobs were already scheduled on OSD, see https://openqa.suse.de/tests/8538952 . coolo found the jobs and mentioned them in https://suse.slack.com/archives/C02AJ1E568M/p1649840218112839 . Maybe the load is actually too high, but I would give it more time before deciding on that. I will monitor over the next few days.
One thing I found is that jobs are also triggered for single-incident tests, which does not make sense as we already have the "retry" jobs.
Updated by okurz over 2 years ago
- Assignee changed from okurz to tinita
I created https://github.com/os-autoinst/scripts/pull/146, merged, to prevent investigation jobs from being triggered when the list of incidents has only one entry (or fewer). https://monitor.qa.suse.de/d/nRDab3Jiz/openqa-jobs-test?orgId=1 shows that we still have a significantly high number of scheduled jobs. It looks to be slowly decreasing, and the public Easter holiday will certainly help as well. But afterwards we should identify which jobs might still be redundant or not providing enough value compared to the additional load.
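The guard described above can be sketched as follows. This is an illustration only, not the code from the PR; the function name should_trigger_bisect is hypothetical, and treating duplicate IDs as a single incident is an assumption of this sketch.

```python
def should_trigger_bisect(incidents):
    """Return True only if there is more than one distinct incident.

    With zero or one incident there is nothing to bisect, and the
    existing "retry" investigation job already covers that case.
    """
    return len(set(incidents)) > 1
```

A caller would check this before scheduling any jobs, so single-incident tests add no extra load on OSD.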
Updated by livdywan over 2 years ago
- Status changed from Feedback to Resolved
Reviewed the ACs in the Unblock and we agree it's looking great