action #81106
coordination #69310 (closed): [epic] SUSE QA tools team ticket process helpers
test out chat service notifications, e.g. matrix, from github actions size:M
Description
Motivation
Based on #69310 we found useful limits, SLOs and queries to check, but we are unsure about notifications, e.g. how we would react to GHA pipeline failures. Maybe it helps to receive Slack notifications.
Acceptance criteria
- AC1: GHA pipeline failures in os-autoinst/scripts trigger matrix notifications
- AC2: team members are aware of where the notifications come from and what needs to be done
- AC3: We are not annoyed by too many frickin' unactionable alerts
Suggestions
For Slack one can try out Slack Messaging in https://github.com/os-autoinst/scripts GHA pipelines
Don't spam channels, and definitely not community channels :)
Updated by okurz almost 4 years ago
- Project changed from openQA Project to QA
- Category deleted (Feature requests)
Updated by okurz almost 4 years ago
- Related to action #77317: chat bot to conduct daily checks, alerts, reminders, etc. added
Updated by okurz over 3 years ago
- Target version changed from Ready to future
This was an idea by the team and for the team, but apparently there is not much interest, so I will move it out of the backlog for now.
Updated by okurz over 3 years ago
- Subject changed from test out rocket chat notifications from github actions to test out chat service notifications, e.g. rocket chat, from github actions
Updated by livdywan about 3 years ago
- Subject changed from test out chat service notifications, e.g. rocket chat, from github actions to test out chat service notifications, e.g. slack, from github actions
- Description updated (diff)
Updated by okurz about 3 years ago
- Subject changed from test out chat service notifications, e.g. slack, from github actions to test out chat service notifications, e.g. matrix, from github actions
- Description updated (diff)
- Target version changed from future to Ready
I am not really motivated to support proprietary tools before free software, so I suggest using Matrix/Element first.
Updated by VANASTASIADIS about 3 years ago
- Status changed from Workable to In Progress
Updated by VANASTASIADIS about 3 years ago
There is a working matrix solution (currently on a personal test repo, but easily reproducible in any workflow with minor additions). I am wondering however:
- currently in the scripts repo there are 2 jobs, containing 3 steps in the workflow: JOB 1: a) check WIP limits b) set due dates JOB 2: c) run CI tests on push
I'm wondering: since the first job is a scheduled job, should a notification be sent on every failure? That's certainly easier (and less complex), but if for example the job fails on a Friday evening, by Monday we'd have a lot of spam in the chat.
Another solution would be saving the previous condition in a file in the repo, and comparing to see if anything changed. That would lead to notifications only the first time something fails. But it would still miss other cases: for example, a case where the same job/step fails but for a different reason.
I think it's simpler in the case of https://github.com/os-autoinst/qa-tools-backlog-assistant: notify only on status change for every query. If "overall" backlog is off limits, you get one notification: the next one will be when it's inside limits again.
Depending on the job in question, different ways of notifying may be preferable: for some jobs only once on every change, for others only one report at fixed intervals... I would appreciate other opinions and thoughts here.
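To make the "notify only on status change" idea above concrete, here is a minimal sketch. It assumes the previous result is kept in a small JSON state file; the function name `should_notify`, the file layout and the check names are hypothetical and not taken from either repo. Storing a short failure reason next to the status also covers the case where the same job keeps failing but for a different reason.

```python
import json
from pathlib import Path


def should_notify(state_file: Path, check: str, status: str, reason: str = "") -> bool:
    """Return True only if status or reason of `check` differs from the last stored one.

    Hypothetical helper: a check that was never seen before is treated as
    previously "ok", so the very first successful run stays silent.
    """
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    previous = state.get(check, {"status": "ok", "reason": ""})
    current = {"status": status, "reason": reason}
    changed = previous != current
    if changed:
        # Persist the new condition so the next run compares against it.
        state[check] = current
        state_file.write_text(json.dumps(state, indent=2, sort_keys=True))
    return changed
```

In a workflow the state file would have to survive between runs, e.g. committed to the repo as suggested above or kept in a cache; which of these is practical is left open here.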
Updated by okurz about 3 years ago
- Subject changed from test out chat service notifications, e.g. matrix, from github actions to test out chat service notifications, e.g. matrix, from github actions size:M
- Description updated (diff)
Updated by livdywan about 3 years ago
VANASTASIADIS wrote:
There is a working matrix solution (currently on a personal test repo, but easily reproducible in any workflow with minor additions). I am wondering however:
What solution is that? Do you have a proof of concept implementing this?
- currently in the scripts repo there are 2 jobs, containing 3 steps in the workflow: JOB 1: a) check WIP limits b) set due dates JOB 2: c) run CI tests on push
I'm wondering: since the first job is a scheduled job, should a notification be sent on every failure? That's certainly easier (and less complex), but if for example the job fails on a Friday evening, by Monday we'd have a lot of spam in the chat.
What do you consider "a lot"? I would suggest to aim for one notification a day. I don't care if we see the exact same the day after, that just means we need to catch up.
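The "one notification a day" suggestion could be sketched as a simple throttle. This is only an illustration: the function name `due_for_notification` is hypothetical, and the hourly schedule in the usage part is an assumption, since the ticket does not state how often the scheduled job runs.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def due_for_notification(last_sent: Optional[datetime], now: datetime,
                         min_interval: timedelta = timedelta(days=1)) -> bool:
    """Return True if nothing was sent yet or the last message is old enough."""
    return last_sent is None or now - last_sent >= min_interval


def count_notifications(start: datetime, end: datetime, every: timedelta) -> int:
    """Simulate a job failing on every run between start and end (inclusive)."""
    sent, last_sent, now = 0, None, start
    while now <= end:
        if due_for_notification(last_sent, now):
            sent += 1
            last_sent = now
        now += every
    return sent


# Assumed scenario: hourly runs failing from Friday 18:00 until Monday 08:00 (UTC).
friday_evening = datetime(2021, 10, 1, 18, 0, tzinfo=timezone.utc)
monday_morning = datetime(2021, 10, 4, 8, 0, tzinfo=timezone.utc)
weekend_messages = count_notifications(friday_evening, monday_morning, timedelta(hours=1))
```

Under these assumptions the weekend produces one message per day instead of one per failing run.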
Updated by VANASTASIADIS about 3 years ago
- Status changed from In Progress to Feedback
Updated by livdywan about 3 years ago
We still need these items here:
- We need #suse-qe-tools to receive notifications
- Add secrets to the GitHub pipeline
- We need a bot account for Matrix
Updated by okurz about 3 years ago
- Related to action #102059: Integrate the Slack feed notifications feature for progress queries added
Updated by okurz about 3 years ago
@VANASTASIADIS what are your own plans on this? Should we unassign you and pick it up within SUSE QE Tools?
Updated by VANASTASIADIS almost 3 years ago
- Assignee deleted (VANASTASIADIS)
@okurz So, for Matrix this should be working, as long as someone adds MATRIX_ACCESS_TOKEN and MATRIX_ROOM_ID to the secrets. I don't have the permissions to add secrets, so someone with the appropriate permissions should do that.
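For readers wondering how the two secrets would be used: a message can be sent to a room with a single authenticated PUT request against the Matrix client-server API. The sketch below is not necessarily what the workflow in os-autoinst/scripts does; the function names and the `MATRIX_HOMESERVER` variable are assumptions made for illustration, only `MATRIX_ACCESS_TOKEN` and `MATRIX_ROOM_ID` come from this ticket.

```python
import json
import os
import urllib.parse
import urllib.request
import uuid


def build_matrix_request(homeserver, room_id, access_token, text, txn_id=None):
    """Build (but do not send) a PUT request posting a plain text message to a room."""
    # The transaction ID makes the request idempotent if it has to be retried.
    txn_id = txn_id or uuid.uuid4().hex
    url = "{}/_matrix/client/v3/rooms/{}/send/m.room.message/{}".format(
        homeserver.rstrip("/"), urllib.parse.quote(room_id, safe=""), txn_id)
    body = json.dumps({"msgtype": "m.text", "body": text}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="PUT", headers={
        "Authorization": "Bearer " + access_token,
        "Content-Type": "application/json",
    })


def notify_from_env(text):
    """Send `text` using secrets exposed as environment variables.

    MATRIX_HOMESERVER is a hypothetical extra variable naming the homeserver
    URL of the bot account.
    """
    request = build_matrix_request(
        os.environ["MATRIX_HOMESERVER"],
        os.environ["MATRIX_ROOM_ID"],
        os.environ["MATRIX_ACCESS_TOKEN"],
        text,
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A failing workflow step could then call something like `notify_from_env("GHA pipeline failed in os-autoinst/scripts")`, with the message text linking back to the failed run so that team members know where the notification comes from.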
In addition, I see that this has rolled back to being Slack-centric. So I will unassign myself; feel free to assign someone and add the secrets, or to proceed with a Slack implementation. If it's not too urgent and you have no free hands, I can tackle the Slack implementation too, after I'm done with a couple of qe-core tickets. Ping me in that case.
Updated by okurz almost 3 years ago
- Status changed from Feedback to New
Actually, it hasn't rolled back to being "Slack-centric". For Slack I was merely using quite a different feature in the related but separate ticket #102059.
Updated by mkittler almost 3 years ago
It is still not clear to me whether this should now be implemented for Matrix or for Slack.
Considering https://github.com/os-autoinst/scripts/pull/108 it seems the Matrix part might have already been concluded and only Slack is left. Even if that's the case, I'm still wondering where the Matrix messages end up, as the room name is literally a secret (and one can only update/remove them).
Updated by okurz almost 3 years ago
- Target version changed from Ready to future
mkittler wrote:
It is still not clear to me whether this should now be implemented for Matrix or for Slack.
Considering https://github.com/os-autoinst/scripts/pull/108 it seems the Matrix part might have already been concluded and only Slack is left. Even if that's the case, I'm still wondering where the Matrix messages end up, as the room name is literally a secret (and one can only update/remove them).
Yes, the Matrix part is only done when we actually see messages, not before :) The priority should be 1. Matrix, then 2. Slack. But as this apparently wasn't interesting enough for the team to pick up for some time, I will for now remove the ticket from our backlog again until someone or something convinces me otherwise :)
Updated by okurz almost 2 years ago
- Status changed from Workable to Rejected
- Assignee set to okurz
okurz wrote:
I am not really motivated to support proprietary tools before free software, so I suggest using Matrix/Element first.
I give up ¯(°_o)/¯ #102059 must suffice for now