action #107923
closed coordination #91646: [saga][epic] SUSE Maintenance QA workflows with fully automated testing, approval and release
qem-bot: Ignore not-ok openQA jobs for specific incident based on openQA job comment size:M
Added by okurz over 2 years ago. Updated about 1 year ago.
Description
Motivation¶
See the proposal in the parent epic #95479, e.g. about the specific format of an openQA label that is readable by qem-bot
Acceptance criteria¶
- AC1: A not-ok openQA job with a comment following the format https://progress.opensuse.org/issues/95479#Suggestions is not blocking approval of incident updates
- AC2: A not-ok openQA job with such a comment is still blocking approval of all other, not specified incident updates
- AC3: A not-ok openQA job without such a comment is still blocking all related incidents
Suggestions¶
- DONE: Add a testing framework to github.com/openSUSE/qem-bot/, e.g. based on github.com/os-autoinst/openqa_review -> #109641
- DONE: Add a simple automatic test exercising one of the existing happy path workflows of qem-bot -> #110167
- DONE: Add automatic tests for the above acceptance criteria
- DONE: As a first quick-and-dirty (and messy) approach, read out openQA comments directly within the approval step of qem-bot (only for the failed jobs, which should not take too long)
- DONE: Parse the mentioned special label string and for the parsed incident remove the according not-ok openQA job from the list of blocking results
- Optional: Add openQA comment parsing over the openQA API together with consistent data in qem-dashboard, i.e.
- As qem-dashboard is the "database for qem-bot", read out the corresponding data from openQA that is pushed to qem-dashboard
- make qem-dashboard store the related data
- and then have qem-bot read it out from there
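The mentioned special label could be parsed with a small sketch like the following. This is only an illustration assuming the `@review:acceptable_for:incident_<number>:<free text>` shape proposed in #95479; the exact regex and function names qem-bot uses may differ:

```python
import re

# Hypothetical sketch of parsing the "acceptable for" label; the pattern
# assumes the format proposed in #95479 and is not qem-bot's actual code.
ACCEPTABLE_FOR = re.compile(
    r"@review:acceptable_for:incident_(?P<incident>\d+)(?::(?P<note>\S+))?"
)


def ignored_incidents(comment_text):
    """Return the incident numbers a comment marks as acceptable."""
    return [int(m.group("incident")) for m in ACCEPTABLE_FOR.finditer(comment_text)]
```

With such a helper, the approval step could drop a failed job from the blocking list of exactly those incidents the comment names, while the job keeps blocking all other incidents.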
Out of scope¶
Visualize such specially handled failed openQA jobs in dashboard
Updated by okurz over 2 years ago
- Related to openqa-force-result #109857: Secure auto-review+force_result size:M auto_review:"Failed to download gobbledeegoop":force_result:softfailed added
Updated by okurz over 2 years ago
- Related to action #111078: Simple automatic test exercising one of the existing happy path workflows of qem-bot size:M added
Updated by okurz over 2 years ago
- Status changed from New to Blocked
- Assignee set to okurz
Updated by okurz over 2 years ago
- Status changed from Blocked to New
- Assignee deleted (okurz)
Updated by okurz over 2 years ago
- Subject changed from qem-bot: Ignore not-ok openQA jobs for specific incident based on openQA job comment to qem-bot: Ignore not-ok openQA jobs for specific incident based on openQA job comment size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by mkittler about 2 years ago
- Status changed from Workable to In Progress
- Assignee set to mkittler
Updated by openqa_review about 2 years ago
- Due date set to 2022-10-28
Setting due date based on mean cycle time of SUSE QE Tools
Updated by jbaier_cz about 2 years ago
Likely a regression: https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/1189679
Updated by mkittler about 2 years ago
PR to fix pipeline failures: https://github.com/openSUSE/qem-bot/pull/75
Updated by jbaier_cz about 2 years ago
It seems that the pipeline may be working, although we should still get rid of the stack traces. To actually see whether we already use the patched version, I propose a little update for the pipeline: https://gitlab.suse.de/qa-maintenance/bot-ng/-/merge_requests/59
Updated by mkittler about 2 years ago
Let's see whether https://github.com/openSUSE/qem-bot/pull/76 fixes the pipeline (e.g. https://gitlab.suse.de/qa-maintenance/bot-ng/-/pipelines/506277).
Updated by jbaier_cz about 2 years ago
The code still does a lot of retries on 404 responses from non-existent comments, resulting in a very long run duration and job termination after the 1h timeout.
Updated by mkittler about 2 years ago
Maybe it is simplest to just disable the retry for now: https://github.com/openSUSE/qem-bot/pull/79
(Not sure how to make it retry only on connection errors and 500 responses, which would likely be the most reasonable choice and which I had assumed was the default retry behavior.)
Updated by mkittler about 2 years ago
- Status changed from In Progress to Feedback
All PRs have been merged. The pipeline works again and doesn't take that long to execute (e.g. https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/1194317).
So putting a comment like https://progress.opensuse.org/issues/95479#Suggestions in an openQA job should now work.
Updated by okurz about 2 years ago
Nice. Also thank you for the announcement in #eng-testing. I suggest awaiting real-life usage. You could query openQA for matching comments to see whether people have used it.
Before resolving, please make sure to put any not-done "optional" tasks into corresponding new ticket(s).
Updated by tinita about 2 years ago
mkittler wrote:
Maybe it is simplest to just disable the retry for now: https://github.com/openSUSE/qem-bot/pull/79
(Not sure how to make it retry only on connection errors and 500 responses, which would likely be the most reasonable choice and which I had assumed was the default retry behavior.)
This should probably be done in the openqa_client.
There's even a related issue: https://github.com/os-autoinst/openQA-python-client/issues/16
I had a quick look at the code, but currently it is using requests, and if I understand it correctly, it would have to be rewritten with urllib3 to use the retry feature.
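For what it's worth, a retry limited to connection errors and 5xx responses can in principle be configured through requests itself, which exposes urllib3's Retry class via HTTPAdapter. This is only a sketch of that idea, not the code qem-bot or openqa_client actually uses (note that the `allowed_methods` parameter requires urllib3 >= 1.26; older versions called it `method_whitelist`):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry only on connection errors and selected 5xx statuses; a plain 404
# (e.g. querying comments of a non-existent job) fails immediately
# instead of being retried until the pipeline times out.
retry = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[500, 502, 503, 504],
    allowed_methods=["GET"],
)
session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retry))
```

Connection errors are retried by urllib3 regardless of `status_forcelist`, so this setup keeps the resilience against transient network failures while dropping the pointless 404 retries.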
Updated by mkittler about 2 years ago
There are no such comments yet:
openqa=# select job_id, text from comments where text like '%review%acceptable%';
job_id | text
--------+------
(0 rows)
Likely it is also possible to tweak the retry while keeping requests. (I cannot imagine that requests makes this impossible; likely they only prefer urllib3 because it already has a built-in feature for exactly what we want.)
Updated by kraih about 2 years ago
Still lots of error messages in the pipeline output https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/1196209:
ERROR: ('GET', 'https://openqa.suse.de/api/v1/jobs/1944211/comments', 404)
Traceback (most recent call last):
File "/builds/qa-maintenance/bot-ng/qem-bot/openqabot/openqa.py", line 73, in get_job_comments
"GET", "jobs/%s/comments" % job_id, retries=0
File "/usr/lib/python3.6/site-packages/openqa_client/client.py", line 184, in openqa_request
return self.do_request(req, retries=retries, wait=wait)
File "/usr/lib/python3.6/site-packages/openqa_client/client.py", line 164, in do_request
raise err
File "/usr/lib/python3.6/site-packages/openqa_client/client.py", line 144, in do_request
request.method, resp.url, resp.status_code
openqa_client.exceptions.RequestError: ('GET', 'https://openqa.suse.de/api/v1/jobs/1944211/comments', 404)
Updated by mkittler about 2 years ago
okurz wrote:
- Optional: Add openQA comment parsing over the openQA API together with consistent data in qem-dashboard, i.e.
- As qem-dashboard is the "database for qem-bot", read out the corresponding data from openQA that is pushed to qem-dashboard
- make qem-dashboard store the related data
- and then have qem-bot read it out from there
That's likely not very useful. We'd need to query all comments when syncing the dashboard, so we'd likely end up with more requests on the openQA side. Besides, judging by the logs of recent approval pipeline runs, I suppose openQA will be able to handle the traffic (it doesn't look like too much, and we also only query comments for failed jobs).
One could do a further optimization: As soon as we see a failing job blocking an incident, we'd avoid checking any further failing job for that incident (e.g. an early break in the loop introduced here: https://github.com/openSUSE/qem-bot/pull/73/files).
Updated by mkittler about 2 years ago
PR for the optimization mentioned in the previous comment: https://github.com/openSUSE/qem-bot/pull/82
Note that the feature isn't used yet (select job_id, user_id, text from comments where text like '%review%acceptable%' and user_id != 126; returns no results).
Updated by mkittler about 2 years ago
PR to make this at least a little bit more discoverable: https://github.com/os-autoinst/openQA/pull/4867
Updated by okurz about 2 years ago
- Related to action #119467: "Internal server error" on opening any job group front page at OSD added
Updated by okurz about 2 years ago
As we learned from #119467 we should have:
- No obvious related regression on OSD
- Test coverage for the branding button code, for both job comments and job group comments
so I suggest introducing such tests in the next attempt.
Updated by okurz about 2 years ago
- Due date changed from 2022-10-28 to 2022-11-11
We hit the mentioned regression and want to improve tests when reintroducing the change. Giving more time for this, which should be an exception.
Updated by mkittler about 2 years ago
Fixed version of the original PR with test coverage for the SUSE branding template code: https://github.com/os-autoinst/openQA/pull/4881
Updated by mkittler about 2 years ago
Looks like the feature is still not used. The PR for adding the button back has been merged.
Updated by okurz about 2 years ago
- Due date deleted (2022-11-11)
- Status changed from Feedback to Resolved
I triggered an extraordinary deployment on OSD and verified that the comment template showed up on https://openqa.suse.de/tests/latest#comments . Documentation and informing people can be done in related tickets, e.g. #111066
Updated by tinita about 2 years ago
- Status changed from Resolved to Feedback
We should be able to use retry now because 404 is now excluded: https://github.com/os-autoinst/openQA-python-client/pull/34
Updated by mkittler about 2 years ago
Ok, although I need to check first whether we already have v4.2.1 of that Python module in the CI environment.
Updated by okurz about 2 years ago
In https://suse.slack.com/archives/C02CBB35W5B/p1668779284852569?thread_ts=1668611066.502389&cid=C02CBB35W5B Veronika Svecova tried to use the special marking comment but it seems to have had no effect. Do you know why incident 26757 cannot be auto-approved? The only linked failed incident test according to http://dashboard.qam.suse.de/incident/26757 is https://openqa.suse.de/tests/9987265 with a comment "@review:acceptable_for:incident_26757:kgraft-patch-SLE12-SP4_Update_23:cleared_with_Marcus", but https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/1246560#L171 states "INFO: Inc 26757 has failed job in incidents"
openqa-cli api --osd job_settings/jobs key=*_TEST_ISSUES list_value=26757
returns
{"jobs":[9987265,9977121]}
which seems quite broken as there are definitely more jobs than two. At least it includes the one failed job that was marked as ignorable. The other job is ok, though.
Adding some logging, qem-bot states that https://openqa.suse.de/tests/1953773 is blocking approval. This job does not even exist on OSD (anymore?), and judging by the number it would be years old. I assume that even force-result won't help then. Sounds like #119161
Updated by kraih about 2 years ago
okurz wrote:
openqa-cli api --osd job_settings/jobs key=*_TEST_ISSUES list_value=26757
returns
{"jobs":[9987265,9977121]}
which seems quite broken as there are definitely more jobs than two. At least it includes the one failed job that was marked as ignorable. The other job is ok, though.
That would be the job_id limit. We only list jobs that are max(job_id) - 20000 or higher for performance. (A trigram index would fix that and show 257 results.) #117655
Updated by kraih about 2 years ago
okurz wrote:
Adding some logging qem-bot states that https://openqa.suse.de/tests/1953773 is blocking approval. This job does not even exist in OSD (anymore?) and judging by the number it would years old. I assume with that even force-result won't help. Sounds like #119161
Here's what the dashboard knows about the incident:
dashboard_db=# select job_id, status from openqa_jobs where incident_settings = 1953780 order by job_id desc;
job_id | status
---------+--------
9962984 | passed
9962255 | passed
9962253 | passed
9936895 | passed
9936894 | passed
9936893 | passed
9936892 | passed
9936891 | passed
9936890 | passed
9936889 | passed
9936888 | passed
9936884 | passed
9936883 | passed
9936882 | passed
9936881 | passed
9936880 | passed
9936879 | passed
9936878 | passed
9936877 | passed
9936876 | passed
9936875 | passed
9936874 | passed
9936873 | passed
9936872 | passed
9936871 | passed
9936870 | passed
9936869 | passed
9936868 | passed
9936867 | passed
9936866 | passed
9936865 | passed
9936864 | passed
9936863 | passed
9936862 | passed
9936861 | passed
9936860 | passed
9936859 | passed
9936858 | passed
9936857 | passed
9936856 | passed
9936855 | passed
9936854 | passed
9936853 | passed
9936852 | passed
9936851 | passed
9936850 | passed
9936849 | passed
9936848 | passed
9936847 | passed
9936846 | passed
9936845 | passed
9936844 | passed
9936843 | passed
9936842 | passed
9936841 | passed
9936840 | passed
(56 rows)
Curious that the newest 4 jobs in OSD for the incident are unknown to the dashboard, and the dashboard also doesn't know about 9937076 etc.:
openqa=# select * from job_settings where key like '%TEST_ISSUES' and value like '%26757%' order by job_id desc;
id | key | value | job_id | t_created | t_updated
-----------+------------------+-------+---------+---------------------+---------------------
455803578 | LIVE_TEST_ISSUES | 26757 | 9987265 | 2022-11-18 10:59:48 | 2022-11-18 10:59:48
455315068 | LIVE_TEST_ISSUES | 26757 | 9977121 | 2022-11-16 19:56:54 | 2022-11-16 19:56:54
454525516 | LIVE_TEST_ISSUES | 26757 | 9962987 | 2022-11-15 12:55:06 | 2022-11-15 12:55:06
454525454 | LIVE_TEST_ISSUES | 26757 | 9962985 | 2022-11-15 12:54:26 | 2022-11-15 12:54:26
454525399 | LIVE_TEST_ISSUES | 26757 | 9962984 | 2022-11-15 12:54:25 | 2022-11-15 12:54:25
454494414 | LIVE_TEST_ISSUES | 26757 | 9962255 | 2022-11-15 10:52:09 | 2022-11-15 10:52:09
454494309 | LIVE_TEST_ISSUES | 26757 | 9962253 | 2022-11-15 10:51:38 | 2022-11-15 10:51:38
453330250 | LIVE_TEST_ISSUES | 26757 | 9937076 | 2022-11-11 15:16:23 | 2022-11-11 15:16:23
453330203 | LIVE_TEST_ISSUES | 26757 | 9937075 | 2022-11-11 15:16:23 | 2022-11-11 15:16:23
453330187 | LIVE_TEST_ISSUES | 26757 | 9937074 | 2022-11-11 15:16:23 | 2022-11-11 15:16:23
453330149 | LIVE_TEST_ISSUES | 26757 | 9937073 | 2022-11-11 15:16:22 | 2022-11-11 15:16:22
453330101 | LIVE_TEST_ISSUES | 26757 | 9937072 | 2022-11-11 15:16:22 | 2022-11-11 15:16:22
453330077 | LIVE_TEST_ISSUES | 26757 | 9937071 | 2022-11-11 15:16:22 | 2022-11-11 15:16:22
...
453323887 | LIVE_TEST_ISSUES | 26757 | 9936895 | 2022-11-11 15:15:09 | 2022-11-11 15:15:09
453323853 | LIVE_TEST_ISSUES | 26757 | 9936894 | 2022-11-11 15:15:09 | 2022-11-11 15:15:09
...
Note that these are not aggregate jobs, so it's unrelated to previous issues with missing data.
Updated by mkittler about 2 years ago
If the bot/dashboard doesn't know/consider the job the comment was created on, then this feature obviously cannot work.
It would be great if the failed jobs were logged (instead of just getting "INFO: Inc 26757 has failed job in incidents"). But before improving the logging it would be best to wait until the currently pending mega-PR (https://github.com/openSUSE/qem-bot/pull/84) has been merged. Then I can also improve the error handling as mentioned in #107923#note-35.
Updated by okurz almost 2 years ago
- Related to action #114415: [timeboxed:10h][spike solution] qem-bot comments on IBS size:S added
Updated by jbaier_cz almost 2 years ago
- Related to action #120939: [alert] Pipeline for scheduling incidents runs into timeout size:M added
Updated by livdywan almost 2 years ago
mkittler wrote:
If the bot/dashboard doesn't know/consider the job the comment was created on, then this feature obviously cannot work.
It would be great if the failed jobs were logged (instead of just getting "INFO: Inc 26757 has failed job in incidents"). But before improving the logging it would be best to wait until the currently pending mega-PR (https://github.com/openSUSE/qem-bot/pull/84) has been merged. Then I can also improve the error handling as mentioned in #107923#note-35.
See also https://github.com/openSUSE/qem-bot/pull/83 which was merged as part of it
Updated by jbaier_cz almost 2 years ago
- Related to action #119161: Approval step of qem-bot says incident has failed job in incidents but it looks empty on the dashboard size:M added
Updated by okurz almost 2 years ago
- Due date set to 2022-12-16
- Status changed from Feedback to In Progress
- Assignee changed from mkittler to okurz
The mentioned PR https://github.com/openSUSE/qem-bot/pull/84 has been reverted in the meantime. I see that we are not progressing here, so I am taking over, trying to bring in my original changes one by one. This should help with debugging from logs.
Updated by okurz almost 2 years ago
- https://github.com/openSUSE/qem-bot/pull/93
- https://github.com/openSUSE/qem-bot/pull/94
- https://github.com/openSUSE/qem-bot/pull/95
- https://github.com/openSUSE/qem-bot/pull/96
EDIT: All merged. New PR, one step at a time:
Updated by okurz almost 2 years ago
- Due date changed from 2022-12-16 to 2022-12-23
- Status changed from In Progress to Feedback
Updated by mkittler almost 2 years ago
What would actually still be left is the improvement mentioned in #107923#note-35.
Updated by okurz almost 2 years ago
- Due date changed from 2022-12-23 to 2023-01-20
Opened a new PR https://github.com/openSUSE/qem-bot/pull/103
I would like to keep this open until more people return after Christmas absence.
Updated by okurz almost 2 years ago
- Related to action #122308: Handle invalid openQA job references in qem-dashboard size:M added
Updated by okurz almost 2 years ago
- Status changed from Feedback to Workable
#122308 was resolved. So next step can be for me to work with tinita on https://github.com/openSUSE/qem-bot/pull/103 and then to crosscheck again if there are any more useful refactorings pending.
Updated by livdywan almost 2 years ago
- Due date deleted (2023-01-20)
I think the Due Date should've been reset here but wasn't.
Updated by okurz almost 2 years ago
- Assignee deleted (okurz)
This needs to be picked up by someone else. I failed to accommodate it.
Updated by okurz over 1 year ago
- Priority changed from Normal to High
This is the only thing we need to complete to finish off #91646 as well -> "High" prio now
Updated by jbaier_cz over 1 year ago
I am just wondering what we are missing here... The main functionality (AC1 and AC2) should be covered by the initial pull request https://github.com/openSUSE/qem-bot/pull/73 and was then refined in some follow-ups. In the end we clarified the difference between dashboard job id and openQA job id and fixed that in https://github.com/openSUSE/qem-bot/pull/109.
All ACs are covered by corresponding tests.
Updated by okurz over 1 year ago
Two things left to do:
- Refactor according to #107923#note-35
- Verify ACs in production, not just from unit tests, i.e. reference an example from logs where non-tools-team users could effectively ignore an aggregate test for one incident
Updated by jbaier_cz over 1 year ago
- Status changed from Workable to In Progress
- Assignee set to jbaier_cz
Updated by jbaier_cz over 1 year ago
- Tags deleted (mob)
Revert the retry-disabling commit and see whether the 404 retries are really gone: https://github.com/openSUSE/qem-bot/pull/113
Updated by openqa_review over 1 year ago
- Due date set to 2023-03-18
Setting due date based on mean cycle time of SUSE QE Tools
Updated by jbaier_cz over 1 year ago
- Status changed from In Progress to Feedback
I created a small PR https://github.com/openSUSE/qem-bot/pull/114 to slightly improve the output of the Approver. Now it should also output a link to the failed/ignored openQA job.
Updated by mkittler over 1 year ago
The most recent usage of the feature is https://openqa.suse.de/tests/10478348#comments (found via select job_id, user_id, text from comments where text like '%review%acceptable%' and user_id != 126;). Not sure whether that's good enough to verify that the feature works.
Updated by jbaier_cz over 1 year ago
- Status changed from Feedback to Resolved
We have a new confirmation today in https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/1444928
2023-03-09 11:04:15 INFO Ignoring failed job https://openqa.suse.de/t10643070 for incident 27492 due to openQA comment
2023-03-09 11:04:15 INFO Ignoring failed job https://openqa.suse.de/t10643071 for incident 27492 due to openQA comment
2023-03-09 11:04:15 INFO Ignoring failed job https://openqa.suse.de/t10643068 for incident 27492 due to openQA comment
2023-03-09 11:04:15 INFO Ignoring failed job https://openqa.suse.de/t10643069 for incident 27492 due to openQA comment
2023-03-09 11:04:15 INFO Found failed, not-ignored job https://openqa.suse.de/t10646456 for incident 27492
2023-03-09 11:04:15 INFO SUSE:Maintenance:27492:290872 has at least one failed job in aggregate tests
Updated by MDoucha over 1 year ago
@review:acceptable_for:... comments are not carried over between jobs. This makes the feature no better than force_result.
Example: https://openqa.suse.de/tests/11750829#comments
Comment missing here: https://openqa.suse.de/tests/11788157#comments
Updated by okurz over 1 year ago
MDoucha wrote:
@review:acceptable_for:... comments are not carried over between jobs. This makes the feature no better than force_result.
Example: https://openqa.suse.de/tests/11750829#comments
Comment missing here: https://openqa.suse.de/tests/11788157#comments
Yes, that's as designed so that no failure stays unnoticed without a bug reference to remind in
Updated by MDoucha over 1 year ago
okurz wrote:
Yes, that's as designed so that no failure stays unnoticed without a bug reference to remind in
@review:acceptable_for does not change the job result. It stays as failed. The purpose of this feature is to whitelist all identical failures for a single incident. It needs to be carried over.
Updated by okurz over 1 year ago
MDoucha wrote:
okurz wrote:
Yes, that's as designed so that no failure stays unnoticed without a bug reference to remind in
@review:acceptable_for does not change the job result. It stays as failed. The purpose of this feature is to whitelist all identical failures for a single incident. It needs to be carried over.
Well, labels are not carried over but you can combine bugrefs and labels in a single openQA comment. So just put the label plus a bugref and the carry over should happen.
Updated by MDoucha over 1 year ago
okurz wrote:
Well, labels are not carried over but you can combine bugrefs and labels in a single openQA comment. So just put the label plus a bugref and the carry over should happen.
Do you realize how dangerous it is that such a "workaround" is even possible? Somebody could do that by accident on a force_result label.
Also, no. If restarting a job means that we'll have to redo all the "acceptable_for" labels, the feature is not finished.
Updated by okurz over 1 year ago
MDoucha wrote:
Do you realize how dangerous it is that such a "workaround" is even possible? Somebody could do that by accident on a force_result label.
Also, no. If restarting a job means that we'll have to redo all the "acceptable_for" labels, the feature is not finished.
I could create a ticket and try to write down what you might want to have, but apparently that failed, so please collaborate with vpelcak to write down what you actually need. We will then use that as a base for discussing how to improve and eventually implement such a feature, after we have agreed upon the exact requirements. Simply "enabling carry over" would conflict with other requirements.
Updated by MDoucha over 1 year ago
okurz wrote:
Simply "enabling carry over" would conflict with other requirements.
Could you be more specific? I don't see any conflict with the 3 acceptance criteria in this ticket and if I'm missing something, it's important information for creating the new ticket.
Updated by okurz over 1 year ago
MDoucha wrote:
okurz wrote:
Well, labels are not carried over but you can combine bugrefs and labels in a single openQA comment. So just put the label plus a bugref and the carry over should happen.
Do you realize how dangerous it is that such a "workaround" is even possible? Somebody could do that by accident on a force_result label.
Well, yes, that accident can happen, but I don't see how we could prevent it while still allowing this at all. We don't see it as a workaround; please see my references to the documentation further below.
Also, no. If restarting a job means that we'll have to redo all the "acceptable_for" labels, the feature is not finished.
Well, that's why you have a choice
MDoucha wrote:
okurz wrote:
Simply "enabling carry over" would conflict with other requirements.
Could you be more specific? I don't see any conflict with the 3 acceptance criteria in this ticket and if I'm missing something, it's important information for creating the new ticket.
OK, let me try. You mentioned the acceptance criteria of the current ticket; let me start with those.
okurz wrote:
Acceptance criteria¶
- AC1: A not-ok openQA job with a comment following format https://progress.opensuse.org/issues/95479#Suggestions is not blocking approval of incident updates
this is fulfilled regardless of whether such an acceptable_for-comment is carried over or not, so we shouldn't be concerned about that
- AC2: A not-ok openQA job with such comment is still blocking approval of all other, not specified incident updates
Right now this is fulfilled because only the incident named in the acceptable_for-comment would be concerned, maybe even independent of carry-over. Still, there might be cases, which I can't think of right now, where an acceptable_for-comment is carried over until it erroneously matches another, unrelated test failure, maybe for the same incident or for a new patch for a different OS version with the same incident number. So to be on the safe side, the more conservative approach of "no carry-over" helps.
- AC3: A not-ok openQA without such comment is still blocking all related incidents
that should also be ok regardless of whether a special comment is carried over or not, as we expect a special comment to only have an effect on the job it is linked to.
#95479-38 and #95479-42 from dzedro explicitly asked for no carry-over, which had happened in that case because a comment included a combined label+bugref.
Next to the acceptance criteria mentioned in this ticket, at least one other requirement comes from the design of openQA "labels and bugrefs" as documented in http://open.qa/docs/#_bug_references_and_labels, which says "A comment containing a bug reference will also be carried over to reduce manual review work." For labels it says "A label containing a bug reference will still be treated as a label, not a bugref. The bugref will still be rendered as a link. That means no bug icon is shown and the comment does not become subject to carry over." And further on: "force_result-labels are evaluated when a comment is carried over. However, the carry over will only happen when the comment also contains a bug reference."
So this brings me back to my earlier suggestion that in the case you describe, the best option is likely to combine an acceptable_for-label with a bugref within one comment, e.g.
@review:acceptable_for:incident_30146:bsc#1212271
bsc#1212271
If you do not want to duplicate the bug link, then the following will likely fulfill the regex check in https://github.com/openSUSE/qem-bot/pull/73/files#diff-d44cab80bf43c62865043529f6e9883cc12ab37d49f5fb5af9e1bdc7c012dcaaR87 as well:
@review:acceptable_for:incident_30146:.
bsc#1212271
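To illustrate why such a combined comment satisfies both mechanisms, here is a sketch with hypothetical, simplified patterns; the real regexes in qem-bot and openQA are more elaborate than these:

```python
import re

# Illustrative patterns only; qem-bot's label regex and openQA's bugref
# regex are more elaborate than these simplified sketches.
LABEL = re.compile(r"@review:acceptable_for:incident_(\d+)(?::\S+)?")
BUGREF = re.compile(r"\b(?:bsc|boo|poo)#\d+\b")

comment = "@review:acceptable_for:incident_30146:bsc#1212271\nbsc#1212271"

# The label makes qem-bot ignore the failure for incident 30146, while
# the standalone bugref makes openQA carry the whole comment over to
# restarted/new jobs with the same failure.
assert LABEL.search(comment)
assert BUGREF.search(comment)
```

So one comment can serve both purposes: the label part drives qem-bot's approval logic, and the bugref part triggers openQA's comment carry-over.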
Updated by MDoucha over 1 year ago
okurz wrote:
- AC2: A not-ok openQA job with such comment is still blocking approval of all other, not specified incident updates
Right now this is fulfilled because only the incident named in the acceptable_for-comment would be concerned, maybe even independent of carry-over. Still, there might be cases, which I can't think of right now, where an acceptable_for-comment is carried over until it erroneously matches another, unrelated test failure, maybe for the same incident or for a new patch for a different OS version with the same incident number. So to be on the safe side, the more conservative approach of "no carry-over" helps.
OK, that seems plausible at least for aggregate tests where all SLE versions are in the same job group. In that case, we need a solution built into the QE dashboard.
Updated by okurz about 1 year ago
- Due date deleted (
2023-03-18)
MDoucha wrote in #note-72:
OK, that seems plausible at least for aggregate tests where all SLE versions are in the same job group. In that case, we need a solution built into the QE dashboard.
Feel welcome to add your suggestion to #65271 . Right now I don't see that SUSE QE Tools team can provide any significant contribution in this direction.
Updated by okurz about 1 year ago
- Related to action #135782: auto-review+force-result ticket does not seem to have an effect when issue tracker changed after the initial comment when carry-over is effective added
Updated by livdywan about 1 year ago
- Related to action #136244: Ensure arbitrary comments can be taken over to new jobs size:M added