action #135647
closed
Separate SLOs+SLAs size:M
Added by okurz over 1 year ago.
Updated about 1 year ago.
Description
Motivation
We have https://progress.opensuse.org/projects/qa/wiki/tools#SLOs-service-level-objectives . To ensure that we meet expectations, I suggest we set internal objectives one order of magnitude smaller than the SLAs which we promise to stakeholders, e.g. we promise to react to an urgent ticket at least once every week, but internally we ensure we update it at least once a day, and accordingly for other priority levels. That should be an agreement within the team, along with corresponding documentation, helpful ticket queries and supporting tooling.
Acceptance Criteria
Suggestions
- DONE Initiate a call or Slack thread discussing aspects of this idea
- DONE Rebrand the existing SLOs as SLAs
- DONE Consider rephrasing the current wiki text on https://progress.opensuse.org/projects/qa/wiki/tools#SLOs-service-level-objectives which only mentions "picking up", not updating already picked up tickets
- DONE Propose that our internal SLOs be one order of magnitude below the SLAs
- DONE Suggest a rule we can use to calculate between SLOs (internal) and SLAs (external), e.g. ticket queries by priority
- DONE Confirm that it is worth having this specified in detail e.g. maybe the team is already making such assumptions?
- DONE* Consider alternatives
- Make sure we have some dashboard available, e.g. either ticket queries or https://os-autoinst.github.io/qa-tools-backlog-assistant/
Further details
- What are Service Level Objectives?
- An objective is a goal that we aim for
- This is our ideal, not necessarily what we do in practice in all cases
- For us, as a team, internal facing
- What are Service Level Agreements?
- An agreement is something we consider ourselves bound to
- If we can't follow up we need a reality check or ensure we can meet the agreements
- For our stakeholders, external facing
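The relationship between the external SLA and the internal SLO could be sketched as below. This is only an illustration, not the team's actual tooling: the "Urgent" values (SLA of one week, SLO of one day) come from the motivation above, while the "High" entry, the rounding up to whole days, and the names `SLA_BY_PRIORITY`, `slo_for` and `check_ticket` are assumptions made for this example.

```python
import math
from datetime import datetime, timedelta

# "Urgent" (one week) is taken from the ticket description.
# "High" is a hypothetical placeholder, not a confirmed value.
SLA_BY_PRIORITY = {
    "Urgent": timedelta(weeks=1),
    "High": timedelta(days=30),  # assumption for illustration only
}


def slo_for(sla: timedelta) -> timedelta:
    """Derive an internal SLO roughly one order of magnitude below the SLA.

    Assumption: the result is rounded up to whole days, so that a
    one-week SLA yields the one-day SLO used as the example in the ticket.
    """
    sla_days = sla / timedelta(days=1)
    return timedelta(days=max(1, math.ceil(sla_days / 10)))


def check_ticket(priority: str, last_update: datetime, now: datetime) -> str:
    """Classify a ticket by the time elapsed since its last update."""
    sla = SLA_BY_PRIORITY[priority]
    age = now - last_update
    if age > sla:
        return "sla_breached"  # external agreement missed
    if age > slo_for(sla):
        return "slo_breached"  # internal objective missed, SLA still met
    return "ok"


if __name__ == "__main__":
    now = datetime(2023, 10, 27, 12, 0)
    print(check_ticket("Urgent", now - timedelta(days=3), now))
```

A ticket that breaches only the SLO serves as an early warning for the team, while a breached SLA means the promise to stakeholders was not kept.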
- Subject changed from Separate SLOs+SLAs to Separate SLOs+SLAs size:M
- Description updated (diff)
- Status changed from New to Workable
- Priority changed from Normal to Urgent
- Description updated (diff)
- Status changed from Workable to In Progress
- Assignee set to okurz
- Status changed from In Progress to Feedback
- Priority changed from Urgent to High
- Status changed from Feedback to In Progress
https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/37 merged. I suggest we make our daily more structured with mandatory questions to answer, e.g.:
- Backlog checks green?
- Time critical issues needing handling?
- What was achieved since the last time?
- Who needs help?
- Plans until next time?
So, what does everybody think? If nobody objects, I would note that on our team wiki and suggest that whoever moderates the daily meeting and text chat follows it.
- Description updated (diff)
fixed description broken by liv's copy from etherpad
- Due date set to 2023-10-27
Setting due date based on mean cycle time of SUSE QE Tools
- Due date deleted (2023-10-27)
- Status changed from In Progress to Resolved
wiki updated about daily process and no disagreement :) wiki is up-to-date, queries are ok, ACs fulfilled
- Status changed from Resolved to In Progress
- Assignee changed from okurz to livdywan
We missed something important here, namely queries for our SLOs. Without them we still won't see that we're behind, which is what we were discussing before. Consequently, I'm creating those now.
My understanding was that we don't need/want those as they are much more likely to fail but let's try.
- Status changed from In Progress to Feedback
okurz wrote in #note-11:
My understanding was that we don't need/want those as they are much more likely to fail but let's try.
I've not added them to the status page yet. We might even want something else. We should have made this an experiment like we usually do so we can check if this helps, and see what we can further improve.
- Related to action #138314: requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://progress.opensuse.org/issues.json?query_id=830 added
- Due date set to 2023-10-27
- Priority changed from High to Normal
I'd still like to confirm that the new SLOs work for everyone. Let's do that in the retro this week.
- Status changed from Feedback to Resolved
livdywan wrote in #note-18:
I'd still like to confirm that the new SLOs work for everyone. Let's do that in the retro this week.
We seem to be okay with the definitions. Quirks of Redmine calculating the hours in the day wrong came up, so for now we'll have it check 2 days for Urgent. And there's #137828 - I have a proof of concept for Slack reminders; it's pretty straightforward and we can see what's useful there.