action #135647

closed

Separate SLOs+SLAs size:M

Added by okurz over 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: Normal
Assignee:
Start date: 2023-09-13
Due date: 2023-10-27
% Done: 0%
Estimated time:

Description

Motivation

We have https://progress.opensuse.org/projects/qa/wiki/tools#SLOs-service-level-objectives . To ensure that we meet expectations, I suggest we set internal objectives one order of magnitude smaller than the SLAs we promise to stakeholders. For example, we promise to react to an urgent ticket at least once every week, but internally we ensure we update it at least once a day, and accordingly for other priority levels. That should be an agreement within the team, along with corresponding documentation, helpful ticket queries and supporting tooling.

Acceptance Criteria

Suggestions

  • DONE Initiate a call or Slack thread discussing aspects of this idea
  • DONE Rebrand the existing SLOs as SLAs
  • DONE Consider rephrasing the current wiki text on https://progress.opensuse.org/projects/qa/wiki/tools#SLOs-service-level-objectives which only mentions "picking up", not updating already picked up tickets
  • DONE Propose that our internal SLOs would be one order of magnitude below the SLAs
  • DONE Suggest a rule we can use to calculate between SLOs (internal) and SLAs (external), e.g. ticket queries by priority
  • DONE Confirm that it is worth having this specified in detail, e.g. maybe the team is already making such assumptions?
  • DONE* Consider alternatives
  • Make sure we have some dashboard available, e.g. either ticket queries or https://os-autoinst.github.io/qa-tools-backlog-assistant/
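
The rule relating internal SLOs to external SLAs could be sketched roughly as follows. This is only an illustration under stated assumptions: the Urgent period of one week comes from the motivation above, while the other periods, the factor of 10 as a reading of "one order of magnitude", the one-day floor, and the names `SLA_UPDATE_PERIOD` and `internal_slo` are hypothetical and not taken from the wiki.

```python
from datetime import timedelta

# Hypothetical SLA update periods per priority. Only the Urgent value
# (one week) comes from the ticket's motivation; the rest are placeholders.
SLA_UPDATE_PERIOD = {
    "Urgent": timedelta(weeks=1),
    "High": timedelta(days=30),
    "Normal": timedelta(days=365),
}


def internal_slo(sla: timedelta, factor: int = 10) -> timedelta:
    """Shrink an SLA period by `factor`, in whole days, never below one day."""
    return timedelta(days=max(1, sla.days // factor))


for priority, sla in SLA_UPDATE_PERIOD.items():
    slo = internal_slo(sla)
    print(f"{priority}: SLA {sla.days} days, internal SLO {slo.days} days")
```

With the one-day floor, the one-week SLA for urgent tickets maps to a one-day internal SLO, which matches the example in the motivation.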

Further details

  • What are Service Level Objectives?
    • An objective is a goal that we aim for
    • This is our ideal, not necessarily what we do in practice in all cases
    • For us, as a team, internal facing
  • What are Service Level Agreements?
    • An agreement is something we consider ourselves bound to
    • If we can't follow up we need a reality check or ensure we can meet the agreements
    • For our stakeholders, external facing

Related issues 1 (0 open, 1 closed)

Related to QA (public) - action #138314: requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://progress.opensuse.org/issues.json?query_id=830 (Rejected, okurz, 2023-10-21)

Actions #1

Updated by livdywan about 1 year ago

  • Subject changed from Separate SLOs+SLAs to Separate SLOs+SLAs size:M
  • Description updated (diff)
  • Status changed from New to Workable
Actions #2

Updated by okurz about 1 year ago

  • Priority changed from Normal to Urgent
Actions #3

Updated by okurz about 1 year ago

wiki updated and proposed https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/37 for rebranding existing SLO queries as SLAs

Actions #4

Updated by livdywan about 1 year ago

  • Description updated (diff)
  • Status changed from Workable to In Progress
  • Assignee set to okurz
Actions #5

Updated by okurz about 1 year ago

  • Status changed from In Progress to Feedback
  • Priority changed from Urgent to High

Liv kindly invited us to discuss the topic in our mob session, which we did successfully. I updated the wiki and proposed the rebranding of the automatic queries with https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/37

Actions #6

Updated by okurz about 1 year ago

  • Status changed from Feedback to In Progress

https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/37 merged. I suggest we make our daily more structured with mandatory questions to answer, e.g.:

  1. Backlog checks green?
  2. Time critical issues needing handling?
  3. What was achieved since the last time?
  4. Who needs help?
  5. Plans until next time?

So, what does everybody think? If nobody objects, I will note that on our team wiki and suggest that whoever moderates the daily meeting and text chat follows it.

Actions #7

Updated by okurz about 1 year ago

  • Description updated (diff)

fixed description broken by liv's copy from etherpad

Actions #8

Updated by openqa_review about 1 year ago

  • Due date set to 2023-10-27

Setting due date based on mean cycle time of SUSE QE Tools

Actions #9

Updated by okurz about 1 year ago

  • Due date deleted (2023-10-27)
  • Status changed from In Progress to Resolved

wiki updated about daily process and no disagreement :) wiki is up-to-date, queries are ok, ACs fulfilled

Actions #10

Updated by livdywan about 1 year ago

  • Status changed from Resolved to In Progress
  • Assignee changed from okurz to livdywan

We missed something important here, namely queries for our SLOs. Without them we still won't see that we're behind, which is what we were discussing before. Consequently I'm creating those now.

Actions #11

Updated by okurz about 1 year ago

My understanding was that we don't need/want those as they are much more likely to fail but let's try.

Actions #12

Updated by livdywan about 1 year ago

  • Status changed from In Progress to Feedback

okurz wrote in #note-11:

My understanding was that we don't need/want those as they are much more likely to fail but let's try.

I've not added them to the status page yet. We might even want something else. We should have made this an experiment like we usually do so we can check if this helps, and see what we can further improve.

Actions #14

Updated by okurz about 1 year ago

https://github.com/os-autoinst/qa-tools-backlog-assistant/actions/runs/6597752651/job/17925059892#step:3:286
shows
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://progress.opensuse.org/issues.json?query_id=830
I think the queries that livdywan created are not publicly available. I will revert the PR.

Actions #15

Updated by okurz about 1 year ago

@livdywan I reverted the PR https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/41 . As I don't seem to have access to the queries, getting an error in my smartphone browser, please make sure the queries you created are accessible in the same way as the others we use on the status dashboard, then recreate the PR to bring in the queries.
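
One way to check whether a saved query is readable without logging in, before adding it to the dashboard, might look like the sketch below. It is an illustration only and not what qa-tools-backlog-assistant does: the function name `query_is_public` and the injectable `opener` parameter are hypothetical, and it uses Python's standard `urllib` rather than `requests`. The URL pattern is the one from the error message above.

```python
import urllib.error
import urllib.request


def query_is_public(query_id, base_url="https://progress.opensuse.org",
                    opener=urllib.request.urlopen):
    """Return True if the saved query can be fetched anonymously.

    `opener` is a hypothetical injection point so the check can be
    exercised without network access.
    """
    url = f"{base_url}/issues.json?query_id={query_id}"
    try:
        with opener(url, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        # 401/403 mean the query exists but is not visible to anonymous users.
        if error.code in (401, 403):
            return False
        raise  # other HTTP errors are unexpected and should surface
```

Calling `query_is_public(830)` without credentials would then indicate whether the query behind the failing job is visible to anonymous users.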

Actions #16

Updated by okurz about 1 year ago

  • Related to action #138314: requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://progress.opensuse.org/issues.json?query_id=830 added
Actions #17

Updated by okurz about 1 year ago

Fixed the queries and created https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/42 with the same content as before. Merged. https://os-autoinst.github.io/qa-tools-backlog-assistant/ now shows the updated content with green SLO queries. Good enough?

Actions #18

Updated by livdywan about 1 year ago

  • Due date set to 2023-10-27
  • Priority changed from High to Normal

I'd still like to confirm that the new SLOs work for everyone. Let's do that in the retro this week.

Actions #19

Updated by livdywan about 1 year ago

  • Status changed from Feedback to Resolved

livdywan wrote in #note-18:

I'd still like to confirm that the new SLOs work for everyone. Let's do that in the retro this week.

We seem to be okay with the definitions. Quirks of Redmine miscalculating the hours in a day came up, so for now we'll have it check 2 days for Urgent. And there's #137828 - I have a proof of concept for Slack reminders, it's pretty straightforward and we can see what's useful there.
