action #113797

Automated alerts and reminders about SLO's for openqatests size:M

Added by okurz 2 months ago. Updated 12 days ago.

Status: Resolved
Priority: High
Assignee:
Target version:
Start date: 2022-07-19
Due date:
% Done: 0%
Estimated time:

Description

Motivation

As with other processes, https://progress.opensuse.org/projects/openqatests/wiki#SLOs-service-level-objectives
is followed better if there are automations that enforce the process, remind about missed targets, etc.

Acceptance criteria

  • AC1: SLOs are ensured with automation

Suggestions


Related issues

Copied to QA - action #116545: Automated alerts and reminders about SLO's for openqatests size:M (Blocked)

History

#1 Updated by cdywan 2 months ago

  • Subject changed from Automated alerts and reminders about https://progress.opensuse.org/projects/openqatests/wiki#SLOs-service-level-objectives to Automated alerts and reminders about https://progress.opensuse.org/projects/openqatests/wiki#SLOs-service-level-objectives size:M
  • Description updated (diff)
  • Status changed from New to Workable

#2 Updated by okurz 2 months ago

#3 Updated by cdywan about 2 months ago

  • Subject changed from Automated alerts and reminders about https://progress.opensuse.org/projects/openqatests/wiki#SLOs-service-level-objectives size:M to Automated alerts and reminders about SLO's for openqatests size:M
  • Assignee set to cdywan

okurz wrote:

better automate the "text template for update comments"

If you mean automated comments based on SLO's as implemented for the backlog of the Tools team then yes I am considering that, too. Actually as a first step I took a look at qe-c-backlog-assistant (which is a fork of the same project) including a conversation as part of the squad of squads (a call with other SMs). I think both are probably useful, and maybe we can make our life easier by enabling both based on the same approach i.e. codify generated markdown, reminders and text snippets in the same configuration.

#4 Updated by cdywan about 2 months ago

  • Status changed from Workable to In Progress

I've prepared a new project which attempts to do that now. There is no full support for QE-C style multiple groups for now (which is not a goal of this ticket). queries.yaml is self-contained and the backlogger can be executed locally as a single command.

Next steps are publishing the new backlogger codebase which can implement qa-tools-backlog-assistant, and a new project e.g. openqatests-backlog-status providing a webpage and reminder comments without custom code.

#5 Updated by openqa_review about 2 months ago

  • Due date set to 2022-08-23

Setting due date based on mean cycle time of SUSE QE Tools

#6 Updated by cdywan about 2 months ago

  • The backlogger project implements a generic version of the Tools backlog checker. It includes a GHA which can be used without(!) copying the code. Only a queries.yaml is needed. The format is slightly changed to use a list, and it only needs to be processed once. As a side effect, running it locally is also nicer.
  • The openQA tests backlog project uses this. It looks basically like the Tools backlog with its own page and refreshes automatically. This project contains no code (not counting GHA workflows).
    • Right now the URL's are copied and the "qa projects" are absent. I'm thinking to save queries and use them, because copying URLs is just as bad as copying code.

#7 Updated by cdywan about 2 months ago

I prepared a branch for the automatic reminder comments based on the existing queries.yaml. The template is a bit simpler and to keep it simple only includes the priority since that can be read from the ticket.

  • Right now the URL's are copied and the "qa projects" are absent. I'm thinking to save queries and use them, because copying URLs is just as bad as copying code.

I also prepared a branch to enable comments and add the missing queries to openqa-tests-backlog. For now without preparing saved queries.

#8 Updated by cdywan about 2 months ago

cdywan wrote:

I prepared a branch for the automatic reminder comments based on the existing queries.yaml. The template is a bit simpler and to keep it simple only includes the priority since that can be read from the ticket.

  • Right now the URL's are copied and the "qa projects" are absent. I'm thinking to save queries and use them, because copying URLs is just as bad as copying code.

I also prepared a branch to enable comments and add the missing queries to openqa-tests-backlog. For now without preparing saved queries.

I provided an example of a working comment based on a manual run (no comments will be added from a PR), see the PR and the comment.

#9 Updated by cdywan about 1 month ago

Of course, since this is handling JSON, I ran into a case of JSONDecodeError. I could not reproduce the issue locally. What I did do is improve the error handling, so that the similar failures I could come up with now produce 404 or 401 and show what the specific error is. Right now that's impossible to see from the pipeline.
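
The kind of error handling described above can be sketched as follows. This is only an illustration, not the code in backlogger: the helper name `decode_response` and its parameters are hypothetical, and it assumes the HTTP status code is checked before the body is decoded.

```python
import json


def decode_response(status_code: int, body: str) -> dict:
    """Decode an API response body, failing with a specific message.

    Hypothetical helper: name and signature are illustrative only.
    """
    # Report authentication and lookup failures explicitly instead of
    # letting json.loads fail on an empty or HTML error body.
    if status_code == 401:
        raise RuntimeError("401 Unauthorized: API key missing or invalid")
    if status_code == 404:
        raise RuntimeError("404 Not Found: check the query URL")
    try:
        return json.loads(body)
    except json.JSONDecodeError as error:
        raise RuntimeError(
            f"HTTP {status_code}: response is not valid JSON"
        ) from error
```

With a check like this, a pipeline log would show the status code rather than only a JSONDecodeError traceback.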

#10 Updated by cdywan about 1 month ago

Turns out the problem was that the PR from my fork has no access to the REDMINE_API_KEY secret. It works so long as I propose it from a branch in upstream.

By chance I discovered that the preview action also doesn't currently support this. It might be worth looking into their work-around once that's done. Tina also suggested we could use a dispatch upstream to work around the restriction.
I would refrain from looking into work-arounds for now, though, and rather use branches on the project.
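
Since the missing secret only surfaced as a JSON error, an early guard could make the cause visible. The following is a minimal sketch under assumptions: only the name REDMINE_API_KEY comes from this ticket, while the function `require_api_key` is hypothetical and not part of backlogger.

```python
import os


def require_api_key(environ=os.environ) -> str:
    """Return the Redmine API key or stop with a clear message.

    Hypothetical helper; only the variable name is taken from the ticket.
    """
    # A secret that is unavailable to the workflow (for example in a PR
    # from a fork) shows up as an unset or empty variable.
    key = environ.get("REDMINE_API_KEY", "")
    if not key:
        raise SystemExit(
            "REDMINE_API_KEY is not set; PRs from forks have no access to it"
        )
    return key
```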

#11 Updated by cdywan about 1 month ago

For the record, since I got confused with that briefly, reminder comments always clear the queries since they filter by updated. No need to worry about how often we send them.
I also sent out a batch of reminders just now, to confirm that.

There's one more point: maybe we should use a bot account. I'm reading up on how to create one. That should be the last step.

#12 Updated by okurz about 1 month ago

Just create a new account from a private browser window. Otherwise for now you can also use the existing "openqa_review" account

#13 Updated by cdywan about 1 month ago

okurz wrote:

Just create a new account from a private browser window. Otherwise for now you can also use the existing "openqa_review" account

It seems that's what I have to do. Creating an account from within Redmine leaves me unable to log in. I created one via IDN and replaced the API key in openqa-tests-backlog.

#14 Updated by cdywan about 1 month ago

  • Status changed from In Progress to Feedback

Let's see how this works out in practice. Some comments have already been sent out under my name, and future ones will be sent by slo-gin, which should have access to the relevant projects.

#15 Updated by cdywan about 1 month ago

One issue with the Within due date query was found, which was missing the "last updated by" field. The due date was not updated, causing repeated comments to be added. I addressed this by updating the query.

#17 Updated by cdywan about 1 month ago

cdywan wrote:

One issue with the Within due date query was found, which was missing the "last updated by" field. The due date was not updated, causing repeated comments to be added. I addressed this by updating the query.

The proper fix, not sending out reminders for "due date" queries, has been merged upstream. Workflow re-enabled.

#18 Updated by cdywan about 1 month ago

  • Due date changed from 2022-08-23 to 2022-08-26

For the record I was keeping this in Feedback since the last fix-up contained a typo. So instead of a one-character fix I'd like to add minimal unit tests based on the previous proof of concept. This is not blocked, though, just delayed because of other on-going tasks.

#19 Updated by cdywan about 1 month ago

cdywan wrote:

For the record I was keeping this in Feedback since the last fix-up contained a typo. So instead of a one-character fix I'd like to add minimal unit tests based on the previous proof of concept. This is not blocked, though, just delayed because of other on-going tasks.

As discussed I'm keeping the tests minimal. However I did need MagicMock to actually test if a comment is posted: https://github.com/kalikiana/backlogger/pull/2
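
For illustration, a minimal test of this kind could look like the sketch below. It is not the test from the linked PR: the function `remind`, its parameters and the comment wording are hypothetical stand-ins, and only the use of `MagicMock` to check that a comment is posted comes from the comment above.

```python
from unittest.mock import MagicMock


def remind(issue: dict, post_comment) -> None:
    """Post one reminder comment for an issue (hypothetical stand-in)."""
    text = (
        f"This ticket was set to {issue['priority']} priority "
        "but was not updated within the SLO period."
    )
    post_comment(issue["id"], text)


def test_reminder_is_posted():
    # The mock replaces the function that would talk to the Redmine API,
    # so the test runs without network access or an API key.
    post_comment = MagicMock()
    remind({"id": 113797, "priority": "High"}, post_comment)
    post_comment.assert_called_once()
    issue_id, text = post_comment.call_args.args
    assert issue_id == 113797
    assert "High" in text
```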

#20 Updated by cdywan about 1 month ago

Also, I moved the repo out of my personal account because otherwise the team won't have full access to everything. It is now openSUSE/backlogger.

#21 Updated by cdywan 28 days ago

  • Status changed from Feedback to Resolved

Everything appears to work as expected now.

#22 Updated by okurz 14 days ago

  • Due date deleted (2022-08-26)
  • Status changed from Resolved to Feedback
  • Priority changed from Normal to High

I see some important open points we need to discuss, as well as problems with the current implementation that I would like to address. This was also brought up in a discussion with jstehlik, so it was not only noticed by me. https://progress.opensuse.org/projects/openqatests/wiki#SLOs-service-level-objectives mentions two different approaches for the "first" and "second" reminder. Currently the bot defeats the purpose of SLOs: the queries on the mentioned wiki page hardly ever show tickets again, because the bot already puts a comment on each ticket so that they don't appear as "not updated recently". The bot also adds the same reminder multiple times, for example in https://progress.opensuse.org/issues/113552#note-11, when instead we should have a second reminder which automatically lowers priority as well. How about the following suggestions:

  1. In https://progress.opensuse.org/projects/openqatests/wiki#SLOs-service-level-objectives explain what a bot is handling automatically already and referencing to instructions how to update the rules that the bot is following. It can be just a link to https://github.com/openSUSE/openqa-tests-backlog/blob/main/queries.yaml , right?
  2. As the queries in https://github.com/openSUSE/openqa-tests-backlog/blob/main/queries.yaml#L5 are not named queries but defined in place we could just define a "grace period" for each query and only act automatically if not done already by users, e.g. don't remind on urgent tickets after 7 days but only 7+2 days
  3. Same as in queries for QA tools we likely should only look at the update time of tickets with no subtasks, e.g. see the definition of https://progress.opensuse.org/issues?query_id=542, to prevent cases like https://progress.opensuse.org/issues/113749#change-552298
  4. Only write the same comment once or better follow the SLO about the suggestion of the "second reminder"
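
The "grace period" from suggestion 2 could be sketched as follows. This is only an illustration of the idea, assuming a per-query SLO and grace period expressed in days; `reminder_due` and its parameters are hypothetical and do not exist in queries.yaml or backlogger.

```python
from datetime import datetime, timedelta


def reminder_due(
    last_updated: datetime, slo_days: int, grace_days: int, now: datetime
) -> bool:
    """Return True once a ticket is older than SLO plus grace period.

    Hypothetical helper: the bot would only act after users had
    grace_days of extra time to update the ticket themselves.
    """
    return now - last_updated > timedelta(days=slo_days + grace_days)


# Example with the numbers from the suggestion: 7 days SLO, 2 days grace.
now = datetime(2022, 9, 1)
print(reminder_due(now - timedelta(days=8), 7, 2, now))   # within grace
print(reminder_due(now - timedelta(days=10), 7, 2, now))  # past 7+2 days
```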

#23 Updated by cdywan 13 days ago

okurz wrote:

  1. In https://progress.opensuse.org/projects/openqatests/wiki#SLOs-service-level-objectives explain what a bot is handling automatically already and referencing to instructions how to update the rules that the bot is following. It can be just a link to https://github.com/openSUSE/openqa-tests-backlog/blob/main/queries.yaml , right?

Good point. I added notes regarding the backlog status as well as the bot configuration.

  2. As the queries in https://github.com/openSUSE/openqa-tests-backlog/blob/main/queries.yaml#L5 are not named queries but defined in place we could just define a "grace period" for each query and only act automatically if not done already by users, e.g. don't remind on urgent tickets after 7 days but only 7+2 days
  3. Same as in queries for QA tools we likely should only look at the update time of tickets with no subtasks, e.g. see the definition of https://progress.opensuse.org/issues?query_id=542, to prevent cases like https://progress.opensuse.org/issues/113749#change-552298
  4. Only write the same comment once or better follow the SLO about the suggestion of the "second reminder"

These points have been brought up and discussed before, so if anything that tells me these are not minor details. I suggest you file according follow-up tickets and we can make sure it's understood what the mechanics are (or an epic if you prefer, no strong preference, but please don't make this ticket a moving target).

#24 Updated by okurz 12 days ago

  • Copied to action #116545: Automated alerts and reminders about SLO's for openqatests size:M added

#25 Updated by okurz 12 days ago

  • Status changed from Feedback to Resolved

cdywan wrote:

okurz wrote:

  1. In https://progress.opensuse.org/projects/openqatests/wiki#SLOs-service-level-objectives explain what a bot is handling automatically already and referencing to instructions how to update the rules that the bot is following. It can be just a link to https://github.com/openSUSE/openqa-tests-backlog/blob/main/queries.yaml , right?

Good point. I added notes regarding the backlog status as well as the bot configuration.

Looks good, thx.

These points have been brought up and discussed before, so if anything that tells me these are not minor details. I suggest you file according follow-up tickets and we can make sure it's understood what the mechanics are (or an epic if you prefer, no strong preference, but please don't make this ticket a moving target).

Sure, done with #116545. Based on our effort estimations we can split then.
