action #121582

closed

[tools][metrics] Calculate cycle + lead times for SUSE QE Tools continuously size:M

Added by okurz about 1 year ago. Updated 11 months ago.

Status: Resolved
Priority: Normal
Assignee:
Target version:
Start date: 2022-12-06
Due date: 2023-03-31
% Done: 0%
Estimated time:

Description

Motivation

We already collected cycle and lead times in the past with some scripts. They were then added to monitor.qa.suse.de, but that eventually stopped working and nobody has resurrected it yet. LSG QE tracks metrics and tries to expand them, see #118135, so we should contribute to that by bringing back cycle and lead time evaluations that can help us in our daily work.

Acceptance criteria

  • AC1: Up-to-date cycle and lead times for SUSE QE Tools can be found on monitor.qa.suse.de

Suggestions


Related issues 4 (1 open, 3 closed)

Related to openQA Tests - action #47891: [functional][u] Continuously update the QSF-u team charts (cycle time, etc.) to know how we perform (Resolved, mgriessmeier, 2019-02-14)

Copied to openQA Infrastructure - action #125765: Make Telegraf errors visible in alert handling (Resolved, okurz, 2022-12-06)

Copied to QA - action #126113: [tools][metrics] Only show queries in backlogger output that are relevant for the according output mode (New)

Copied to QA - action #127025: [tools][metrics] Improve cycle + lead times in Grafana (Resolved, okurz)

Actions #2

Updated by livdywan about 1 year ago

  • Subject changed from [tools][metrics] Calculate cycle + lead times for SUSE QE Tools continuously to [tools][metrics] Calculate cycle + lead times for SUSE QE Tools continuously size:M
  • Description updated (diff)
  • Status changed from New to Workable
Actions #3

Updated by livdywan about 1 year ago

  • Status changed from Workable to In Progress
  • Assignee set to livdywan

I'm taking a look. Not sure what existing scripts there might actually be, but I'll try to ask a couple more people to see what I can re-use.

Actions #4

Updated by openqa_review about 1 year ago

  • Due date set to 2022-12-29

Setting due date based on mean cycle time of SUSE QE Tools

Actions #5

Updated by tinita about 1 year ago

This project from Ivan might be helpful: https://github.com/ilausuch/redmine_statistics

Actions #6

Updated by livdywan about 1 year ago

tinita wrote:

This project from Ivan might be helpful: https://github.com/ilausuch/redmine_statistics

Thanks for suggesting it. I also looked at stunning-octo-chainsaw (which measures numbers from GitHub if you couldn't guess from its name). Actually after taking a closer look and playing with it I realized we don't need to compute anything. Grafana will do that based on InfluxDB line protocol which periodically consumes total counts.

Output currently looks like so:

slo,team="QE Tools",title="Overall Backlog" count=91
slo,team="QE Tools",title="Workable Backlog" count=12
slo,team="QE Tools",title="Exceeding Due Date" count=0
slo,team="QE Tools",title="Untriaged QA" count=0
slo,team="QE Tools",title="Untriaged Tools Tagged" count=0
slo,team="QE Tools",title="SLO immediate (<1 day)" count=0
slo,team="QE Tools",title="SLO urgent (<1 week)" count=0
slo,team="QE Tools",title="SLO high (<1 month)" count=1
slo,team="QE Tools",title="SLO normal (<1 year)" count=0
slo,team="QE Tools",title="In Progress" count=5
slo,team="QE Tools",title="In Feedback" count=13
Actions #7

Updated by livdywan about 1 year ago

  • Due date changed from 2022-12-29 to 2023-01-13
  • Status changed from In Progress to Feedback

Will wait on feedback for now, with all pieces in place. Feel free to comment on the sample output to confirm if this is what we want or if we want other numbers. Not going to work on it til next year, though.

Actions #8

Updated by okurz about 1 year ago

I don't know how the above gives cycle and lead times. Can you explain?

Actions #9

Updated by livdywan about 1 year ago

okurz wrote:

I don't know how the above gives cycle and lead times. Can you explain?

Let's take an example. slo,team="QE Tools",title="Workable Backlog" count=12 tells us that the Workable Backlog has 12 issues in it. This is updated hourly (configured in salt). I've not prepared Grafana dashboards yet that would render this into a graph.

I assume this is what we want. If not please explain by example what you might be expecting.

Actions #10

Updated by tinita about 1 year ago

https://kanbanize.com/kanban-resources/kanban-software/kanban-lead-cycle-time

That's why I posted a link to Ivan's code, I think it might help

Actions #11

Updated by okurz about 1 year ago

@cdywan see the definition from the link that tinita posted

Actions #12

Updated by okurz about 1 year ago

  • Due date changed from 2023-01-13 to 2023-01-20

Christmas grace due date bump :)

Actions #13

Updated by livdywan about 1 year ago

Here's an example including additional fields analogous to what's implemented in the redmine_statistics project. I was hoping to get some early feedback on the minimal approach, but I guess I'm keeping it in the same branch now:

slo,team="QE Tools",status="New",title="Overall Backlog" count=6 avg=4.231111111111111 med=2.4391666666666665 std=48.9769975
slo,team="QE Tools",status="Workable",title="Overall Backlog" count=10 avg=15.109333333333334 med=18.405833333333334 std=49.890591755829895
slo,team="QE Tools",status="In Progress",title="Overall Backlog" count=4 avg=3.396111111111111 med=3.2463888888888888 std=6.536649537037036
slo,team="QE Tools",status="Blocked",title="Overall Backlog" count=5 avg=11.00961111111111 med=7.204722222222222 std=91.1066115200617
slo,team="QE Tools",status="Workable",title="Workable Backlog" count=10 avg=15.109333333333334 med=18.405833333333334 std=49.890591755829895
slo,team="QE Tools",status="Feedback",title="In Feedback" count=13 avg=6.791153846153847 med=3.6266666666666665 std=69.11117296612852
Actions #14

Updated by livdywan about 1 year ago

  • Status changed from Feedback to In Progress

This is in progress.

Actions #16

Updated by okurz about 1 year ago

  • Related to action #47891: [functional][u] Continuously update the QSF-u team charts (cycle time, etc.) to know how we perform added
Actions #17

Updated by okurz about 1 year ago

I looked up the original tickets #43442 and #47891, according code repo https://github.com/DrMullings/Scripts-Snippets-Stuff

Actions #18

Updated by okurz about 1 year ago

As discussed, "lead time" could be implemented as "time_when_resolved - time_when_created" and "cycle time" as "time_when_resolved - time_when_assigned". The latter has the drawback that tickets which are assigned but not actively worked on also count towards the cycle time, but I consider those rare exceptions that we can maybe avoid if the cycle time alerts us about them. An alternative is to sum up all periods in which a ticket is in progress or feedback, excluding the time spent in new or workable, but I would leave that out for now.
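To make the two definitions concrete, here is a minimal sketch, assuming the three timestamps are already available as datetime objects. The function names and the sample ticket are made up for illustration and are not taken from any existing script.

```python
from datetime import datetime, timedelta


def lead_time(created: datetime, resolved: datetime) -> timedelta:
    """Lead time: from ticket creation until resolution."""
    return resolved - created


def cycle_time(assigned: datetime, resolved: datetime) -> timedelta:
    """Simplified cycle time: from assignment until resolution.

    Time spent assigned but idle is counted as well (the drawback noted above).
    """
    return resolved - assigned


# Hypothetical ticket, timestamps invented for illustration
created = datetime(2023, 1, 2, 9, 0)
assigned = datetime(2023, 1, 9, 9, 0)
resolved = datetime(2023, 1, 12, 15, 0)

lead_hours = lead_time(created, resolved).total_seconds() / 3600
cycle_hours = cycle_time(assigned, resolved).total_seconds() / 3600
print(f"lead time: {lead_hours:.1f} h, cycle time: {cycle_hours:.1f} h")
```

Both values are plain differences of timestamps, so only the choice of the start timestamp distinguishes them.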

Actions #19

Updated by livdywan about 1 year ago

Talked about it briefly. @tinita raised the point that we should ideally measure each period of time a ticket is in progress, and Scripts-Snippets-Stuff doesn't seem to handle that. I was taking a look at the journal data before, although it's not included in my branch so far, and I think that's doable.
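A rough sketch of what measuring each "in progress" period could look like, assuming the status changes of a ticket have already been extracted from its journal as a sorted list of (timestamp, new status) pairs. That input shape and the function name are assumptions for illustration, not what the branch does.

```python
from datetime import datetime, timedelta


def time_in_progress(status_changes, active=("In Progress",)):
    """Sum up every period a ticket spent in one of the `active` statuses.

    `status_changes` is a chronologically sorted list of (timestamp, new_status)
    tuples, a simplified stand-in for the status changes in a ticket journal.
    A period that is still open at the end of the list is not counted.
    """
    total = timedelta()
    started = None  # when the current active period began, if any
    for when, status in status_changes:
        if status in active:
            if started is None:
                started = when
        elif started is not None:
            total += when - started
            started = None
    return total


# Hypothetical journal: worked on twice, with a pause in between
journal = [
    (datetime(2023, 1, 2, 10, 0), "Workable"),
    (datetime(2023, 1, 3, 10, 0), "In Progress"),
    (datetime(2023, 1, 4, 10, 0), "Workable"),
    (datetime(2023, 1, 9, 10, 0), "In Progress"),
    (datetime(2023, 1, 9, 16, 0), "Resolved"),
]
print(time_in_progress(journal))
```

In this example the five days the ticket sat in Workable between the two work periods are left out of the result.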

Actions #20

Updated by livdywan about 1 year ago

okurz wrote:

As discussed "lead time" could be implemented by looking at "time_when_resolved - time_when_created", "cycle time" could be "time_when_resolved - time_when_assigned" which has the drawback that tickets that are assigned but not actively worked on also account for the cycle time

Right. That includes the user story "Kim is assigning themself to a ticket a few days before actively working on it".

Actions #21

Updated by livdywan about 1 year ago

  • Due date changed from 2023-01-20 to 2023-01-27
  • Status changed from In Progress to Workable

I'm not blocked here, but simply couldn't make time to work on the last step because of other tasks. The integration within Grafana also still needs to be tested, so even then we'd want to allow people to verify that we see according data on the dashboard.

Maybe I should for now make it Workable in case somebody else has spare cycles and is interested in working on it. Otherwise I plan to pick it up again next week.

Actions #22

Updated by livdywan about 1 year ago

  • Tags deleted (mob)

To figure out how the data needs to look Tina and I took an example and analyzed it from the raw data to the query used in a Grafana dashboard:

Transferring this to the context of cycle time we want 1) the number of resolved tickets and 2) the sum of time those tickets spent "in progress", calculated in the backlogger code. An example of multiple days, assuming we feed data daily and look at the data from the last day, could look like so:

day1: count_resolved=1 sum_cycle_time=14
day2: count_resolved=0 sum_cycle_time=0 (no data)
day3: count_resolved=2 sum_cycle_time=10

Grafana will process this data and make available the average/median/whatever we choose in a query for a given time span.
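The same idea as a small sketch, with Python standing in for what Grafana would do in a query. The helper names are hypothetical, and the split of day3 into two tickets of 4 and 6 is made up; only the totals come from the example above.

```python
def daily_point(cycle_times):
    """Reduce the cycle times of tickets resolved on one day to count and sum."""
    return {"count_resolved": len(cycle_times), "sum_cycle_time": sum(cycle_times)}


def mean_cycle_time(points):
    """Average cycle time over any span of days; None if nothing was resolved."""
    count = sum(p["count_resolved"] for p in points)
    total = sum(p["sum_cycle_time"] for p in points)
    return total / count if count else None


# day1..day3 from the example above; the split of day3 into 4 + 6 is made up
days = [daily_point([14]), daily_point([]), daily_point([4, 6])]

print(mean_cycle_time(days))       # all three days
print(mean_cycle_time(days[1:2]))  # day2 only: nothing was resolved
```

Days without resolved tickets contribute nothing to either sum, so they don't distort the average.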

Actions #23

Updated by livdywan about 1 year ago

  • Due date deleted (2023-01-27)

Due date is not generally applicable to workable

Actions #25

Updated by livdywan about 1 year ago

Apparently @ilausuch is now pursuing some grand plan to feed all Redmine and Bugzilla tickets into Grafana and make this accessible to all teams, see #123541. So we may not want to duplicate the effort at this point?

Actions #26

Updated by okurz about 1 year ago

cdywan wrote:

Apparently @ilausuch is now pursuing some grand plan to feed all Redmine and Bugzilla tickets into Grafana and make this accessible to all teams, see #123541. So we may not want to duplicate the effort at this point?

I would be happy to learn more about that grand plan and how the QAC squad supports that. As long as that is just a private, personal idea I don't think we can rely on it. I would really prefer a simple approach for now that also provides something useful without relying on data in grafana. Like https://os-autoinst.github.io/qa-tools-backlog-assistant/ showing the cycle number?

Actions #27

Updated by livdywan 12 months ago

We already have an implementation. And I raised the point once more this week that it would be good to update the according tickets.

For all intents and purposes we'll proceed here as already discussed, I simply didn't have a chance to pick it up for unrelated reasons.

Actions #28

Updated by livdywan 12 months ago

  • Status changed from Workable to In Progress

cdywan wrote:

For all intents and purposes we'll proceed here as already discussed, I simply didn't have a chance to pick it up for unrelated reasons.

As discussed in #note-22 I added the cycleTime based on the "in progress" time observed in the journal https://github.com/openSUSE/backlogger/pull/15 and it looks something like this:

slo,team="QE Tools",status="Workable",title="Workable Backlog" count=11                                                 
slo,team="QE Tools",status="Feedback",title="In Feedback" count=12                                                      
leadTime,team="QE Tools",status="Resolved",title="Closed within last 60 days" count=25 leadTime=672.4589777777778 cycleTime=73.2654111111111
Actions #29

Updated by openqa_review 12 months ago

  • Due date set to 2023-03-15

Setting due date based on mean cycle time of SUSE QE Tools

Actions #30

Updated by livdywan 12 months ago

I proposed https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/500 in addition to deploying the script itself because we need an API to be available.

Actions #31

Updated by livdywan 12 months ago

Apparently I can manually invoke the script with sudo, but Grafana won't recognize slo as a metric, i.e. it can't be selected in FROM, and count doesn't show up as a field either 🤔

sudo env REDMINE_API_KEY=... /etc/telegraf/scripts/backlogger/backlogger.py --output=influxdb /etc/telegraf/scripts/tools-backlog/queries.yaml

leadTime is absent in the output I see when running the script manually, but that's expected since I forgot to propose the PR for that: https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/33

Actions #32

Updated by okurz 12 months ago

On monitor.qa.suse.de journalctl -u telegraf says:

Mar 10 11:00:03 openqa-monitor telegraf[1335]: 2023-03-10T10:00:03Z E! [inputs.exec] Error in plugin: metric parse error: expected field at 1:20: "slo,team=\"QE Tools\",status=\"New\",tit>

You can execute as test yourself:

sudo telegraf --test --config /etc/telegraf/telegraf.d/slo.conf

but of course you can do that without sudo locally.

Actions #33

Updated by livdywan 12 months ago

  • Copied to action #125765: Make Telegraf errors visible in alert handling added
Actions #34

Updated by tinita 12 months ago

What I don't understand is why we are not using the approach as we discussed in our meeting and noted down in https://progress.opensuse.org/issues/121582#note-22

Of course the approach of calculating the cycle time in the python script and delivering it to Grafana is easier, but it has drawbacks.

  1. We only get a 60 day time frame. We don't have the possibility to tell Grafana to show the cycle time in, let's say, january compared to february, or the first two weeks of january vs. weeks 3 and 4.

  2. If we picked a smaller timeframe than 60 days, then we wouldn't be able to let Grafana tell us the average of a bigger time frame. E.g. in the extreme case of "Week 1 has 1 ticket with cycle time 100 days, week 2 has 10 tickets with cycle time 10 days", what's the average cycle time of week 1 and 2? Grafana would show us 55 here, which is wrong, it would be ~18

Delivering the number of tickets and the sum of all their cycle times would enable us in Grafana to pick any time frame we want and get the average, calculated by Grafana itself.
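The numbers from the week 1 / week 2 example can be checked with a few lines. This is only the arithmetic, not a Grafana query:

```python
# (ticket count, sum of cycle times in days) per week, numbers from the example above
weeks = [
    (1, 100),       # week 1: 1 ticket with a cycle time of 100 days
    (10, 10 * 10),  # week 2: 10 tickets with a cycle time of 10 days each
]

# Averaging the weekly averages ignores how many tickets each week had
weekly_averages = [total / count for count, total in weeks]
mean_of_means = sum(weekly_averages) / len(weekly_averages)

# Dividing the summed cycle times by the summed counts weights every ticket equally
pooled_mean = sum(total for _, total in weeks) / sum(count for count, _ in weeks)

print(f"mean of weekly averages: {mean_of_means:.1f}")
print(f"pooled mean: {pooled_mean:.1f}")
```

The first value is the 55 mentioned above, the second is 200 / 11, roughly 18.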

Of course, the simple approach could be just enough for our needs, I just wanted to note the limits of it, so that we are not wondering about it later.

Actions #35

Updated by livdywan 12 months ago

tinita wrote:

Of course, the simple approach could be just enough for our needs, I just wanted to note the limits of it, so that we are not wondering about it later.

My comments should have given you the impression that I was following that approach to the best of my knowledge. If that's not the case I guess it needs to be fixed.

Actions #36

Updated by tinita 12 months ago

My comments should have given you the impression that I was following that approach to the best of my knowledge.

Right, I thought so, but taking a closer look, backlogger.py delivers the averages already, not the sums.

If that's not the case I guess it needs to be fixed.

Adding the sum of cycle and lead times should actually be easy, and then in Grafana we can figure out what we use as soon as we have some data to play with, so I'll try to make a PR.

Actions #37

Updated by tinita 12 months ago

I just found out that https://progress.opensuse.org/issues.json?query_id=541 gives me 25 issues, but total_count is 119:

% curl -s "https://progress.opensuse.org/issues.json?query_id=541" | jq '.issues[] | .id'  | wc -l
25
% curl -s "https://progress.opensuse.org/issues.json?query_id=541" | jq '.total_count'
119

So we might also suffer from the default limit. Adding per_page=100 doesn't change anything, and it wouldn't be enough for our numbers anyway.

edit: Ah, it needs to be limit and not per_page for JSON queries. But still, the upper limit is 100.
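For completeness, a sketch of how iterating through the pages could look. The Redmine REST API accepts an offset parameter next to limit; everything else here is an assumption: fetch_page is a hypothetical callable that would perform the actual request against issues.json, and the fake below only mimics the numbers seen above.

```python
def fetch_all_issues(fetch_page, limit=100):
    """Collect all issues of a query by walking through the pages.

    `fetch_page(offset, limit)` is a hypothetical callable returning the decoded
    JSON of one request, i.e. a dict with "issues" and "total_count".
    """
    issues = []
    while True:
        page = fetch_page(len(issues), limit)
        issues.extend(page["issues"])
        # Stop when everything was collected or the server returns an empty page
        if not page["issues"] or len(issues) >= page["total_count"]:
            return issues


# Stand-in for the Redmine API: 119 issues, at most 100 per request
def fake_fetch_page(offset, limit):
    ids = list(range(1, 120))
    limit = min(limit, 100)
    return {"issues": [{"id": i} for i in ids[offset:offset + limit]], "total_count": len(ids)}


print(len(fetch_all_issues(fake_fetch_page)))
```

With 119 issues this needs two requests, which is the extra load on Redmine that a smaller timeframe would avoid.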

Actions #38

Updated by livdywan 12 months ago

okurz wrote:

Mar 10 11:00:03 openqa-monitor telegraf[1335]: 2023-03-10T10:00:03Z E! [inputs.exec] Error in plugin: metric parse error: expected field at 1:20: "slo,team=\"QE Tools\",status=\"New\",tit>

I can't say I understand why the format is being rejected, since the way I read the docs double quotes should be fine. It does pass if I strip spaces from the values.

I prepared a branch with a GitHub workflow based on a minimal telegraf config file anyway, which maybe I should've added in the first place: https://github.com/openSUSE/backlogger/pull/16

Actions #39

Updated by tinita 12 months ago

I can't say I understand why the format is being rejected, since the way I read the docs double quotes should be fine. It does pass if I strip spaces from the values.

Ah, it looks like spaces (and other chars) must be escaped. I searched for space and found this: weather,location\ place=us-midwest temperature=82 1465839830100400200

Actions #40

Updated by tinita 12 months ago

Given that for 60 days we get a lot of issues and would have to iterate through pages to work around the maximum limit of 100, I would say the approach of just taking a timeframe of 1 day and then letting Grafana do the rest for us would be better (and also save Redmine some CPU ;-)

I just added the leadTimeSum and cycleTimeSum in this PR: https://github.com/openSUSE/backlogger/pull/17

So my suggestion would be to adapt the 541 query to 1 day now, and once we have data in Grafana, we can construct the correct Grafana query.

Actions #41

Updated by livdywan 12 months ago

tinita wrote:

So my suggestion would be to adapt the 541 query to 1 day now, and once we have data in Grafana, we can construct the correct Grafana query.

We should use a new query then, since I did not create this one. But otherwise sounds good to me.

tinita wrote:

I can't say I understand why the format is being rejected, since the way I read the docs double quotes should be fine. It does pass if I strip spaces from the values.

Ah, it looks like spaces (and other chars) must be escaped. I searched for space and found this: weather,location\ place=us-midwest temperature=82 1465839830100400200

I still don't understand it but it seems escaping the space after the tags makes it work:

slo,host=tumbleweed.hel,status="Workable",team="Example",title="Workable Backlog"\ count=15 1678
Actions #42

Updated by tinita 12 months ago

I rather thought the spaces inside the fields should be escaped, e.g. title="Workable\ Backlog"
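A minimal sketch of the escaping rules as I understand the line protocol documentation: in tag keys and tag values, commas, equals signs and spaces are escaped with a backslash, and tag values are written without quotes (quotes would become part of the value). escape_tag and format_point are hypothetical helpers, not the backlogger code, and the field part only handles plain numeric values.

```python
def escape_tag(value):
    """Escape the characters that are special in tag keys and tag values."""
    for char in (",", "=", " "):
        value = value.replace(char, "\\" + char)
    return value


def format_point(measurement, tags, fields):
    """Build one line of line protocol; tag values are written without quotes."""
    tag_part = ",".join(f"{escape_tag(k)}={escape_tag(v)}" for k, v in tags.items())
    # Assumption: all field values are plain numbers, so no quoting is needed
    field_part = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement},{tag_part} {field_part}"


print(format_point("slo", {"team": "QE Tools", "title": "Workable Backlog"}, {"count": 12}))
```

The unescaped space that remains is the separator between the tag set and the field set, which is why an unescaped space inside a tag value makes the parser expect a field too early.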

Actions #43

Updated by tinita 12 months ago

I added a query now: https://progress.opensuse.org/issues?query_id=773 QE tools team - closed yesterday

This way we also don't depend on the exact time at which the query is run, as long as it is run once a day.

Actions #44

Updated by livdywan 11 months ago

https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/34

So it turns out the "last closed" query looked like this in markdown:

Closed yesterday | 0 | <1, >0 | 🔴

Actions #45

Updated by tinita 11 months ago

We should also investigate how to get better error messages from telegraf.
This was the error shown in the journal, when the redmine query returned a 403:

[inputs.exec] Error in plugin: exec: exit status 1 for command '/etc/telegraf/scripts/backlogger/backlogger.py --output=influxdb /etc/telegraf/scripts/tools-backlog/queries.yaml': Traceback (most recent call last):...

That's also all I could see when calling telegraf --test --config slo.conf locally.

Note that the 3 dots at the end are literal and not added by me. I did not see any further explaining error there, but the backlogger output clearly showed the 403. So telegraf is swallowing useful information about errors.

Actions #46

Updated by tinita 11 months ago

It seems that truncating the error message to the first line is a feature :(
https://github.com/influxdata/telegraf/issues/5415

Actions #47

Updated by tinita 11 months ago

As a side note, the backlog ticket counts can already be watched in this graph: https://stats.openqa-monitor.qa.suse.de/d/1pHb56Lnk/tinas-dashboard?from=now-2d&to=now&viewPanel=22

Actions #48

Updated by livdywan 11 months ago

Next 3 steps before putting this in Feedback:

  • Reintroduce the closed query to the QE Tools status
  • Reduce frequency in Grafana to daily
  • Save the new dashboard JSON in Salt
Actions #49

Updated by livdywan 11 months ago

cdywan wrote:

Next 3 steps before putting this in Feedback:

  • Reintroduce the closed query to the QE Tools status

https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/35

Actions #50

Updated by livdywan 11 months ago

cdywan wrote:

Next 3 steps before putting this in Feedback:

  • Reintroduce the closed query to the QE Tools status
  • Reduce frequency in Grafana to daily

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/810

Actions #51

Updated by tinita 11 months ago

A question about the data points we are feeding into influxdb.
This is the current output I get from --output=influxdb:

slo,team="QE\ Tools",status="New",title="Overall\ Backlog" count=5
slo,team="QE\ Tools",status="Workable",title="Overall\ Backlog" count=17
slo,team="QE\ Tools",status="In\ Progress",title="Overall\ Backlog" count=6
slo,team="QE\ Tools",status="Blocked",title="Overall\ Backlog" count=51
slo,team="QE\ Tools",status="Feedback",title="Overall\ Backlog" count=17
slo,team="QE\ Tools",status="Workable",title="Workable\ Backlog" count=17
slo,team="QE\ Tools",status="In\ Progress",title="In\ Progress" count=6
slo,team="QE\ Tools",status="Feedback",title="In\ Feedback" count=20

So we are getting certain numbers multiple times, and even different numbers, e.g. for Feedback we get 17 and 20.

I looked into the code and it is not really clear to me now why we would go over all of the queries in queries.yaml and sum up numbers per ticket status from all of them.

Actions #52

Updated by livdywan 11 months ago

  • Due date changed from 2023-03-15 to 2023-03-31
  • Status changed from In Progress to Feedback

cdywan wrote:

cdywan wrote:

Next 3 steps before putting this in Feedback:

  • Reintroduce the closed query to the QE Tools status

https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/35

This hasn't been merged yet because unfortunately we're again discussing the backlogger design in the downstream PR after having implemented it.

I looked into the code and it is not really clear to me now why we would go over all of the queries in queries.yaml and sum up numbers per ticket status from all of them

Ashley Average wants to monitor the number of tickets in Feedback in Grafana.

That's the user story. I don't know if it's efficient or free of bugs. The answer as to "Why" is that it's the easiest way and requires no configuration.

Actions #53

Updated by livdywan 11 months ago

cdywan wrote:

cdywan wrote:

Next 3 steps before putting this in Feedback:

  • Reintroduce the closed query to the QE Tools status
  • Reduce frequency in Grafana to daily

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/810

The change has been deployed but it doesn't seem to be effective...

sudo grep interval /etc/telegraf/telegraf.d/slo.conf                                                  
  interval = "1d"
Actions #54

Updated by okurz 11 months ago

  • Copied to action #126113: [tools][metrics] Only show queries in backlogger output that are relevant for the according output mode added
Actions #55

Updated by livdywan 11 months ago

cdywan wrote:

cdywan wrote:

Next 3 steps before putting this in Feedback:

  • Reintroduce the closed query to the QE Tools status

https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/35

Merged.

Let's give it time and prepare the new panel next week.

Oli will file a ticket on the output configuration.

Actions #56

Updated by livdywan 11 months ago

error loading config file /etc/telegraf/telegraf.d/slo.conf: error parsing exec, line 1:{0 286}: error parsing duration: time: unknown unit "d" in duration "1d"

Apparently 1d is not a valid unit for the interval: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/813

Actions #57

Updated by tinita 11 months ago

I was thinking about the interval and possibly duplicate entries (if the query would run multiple times per day) and found this:
https://docs.influxdata.com/influxdb/v1.8/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points

So in order to not lose data if one query fails (and we only make one query per day), we could add the unique timestamp of the data (e.g. use the time 00:00:00 of the day the query is made) to the output and then run the query every 8 hours for example. Influxdb would still see it as one data point if they all have the same timestamp.

That strategy can also be used once we feed data for every individual ticket to influxdb.
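A small sketch of the timestamp idea, assuming 00:00:00 in UTC is used as the day boundary (the time zone is my assumption) and that line protocol timestamps are given in nanoseconds. The helper names are made up.

```python
from datetime import date, datetime, timezone


def midnight_ns(day):
    """Nanoseconds since the Unix epoch for 00:00:00 UTC of the given day."""
    midnight = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    return int(midnight.timestamp()) * 10**9


def with_timestamp(line, day):
    """Append the timestamp of the day to one line of line protocol."""
    return f"{line} {midnight_ns(day)}"


day = date(2023, 3, 28)
# Runs at different times of the same day produce the same line
print(with_timestamp("slo,team=QE\\ Tools count=12", day))
```

Since the measurement, tag set and timestamp are identical for every run on the same day, repeated runs should end up as one point rather than several.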

Actions #58

Updated by tinita 11 months ago

https://github.com/openSUSE/backlogger/pull/21 Add timestamp in nanoseconds for influxdb output

Actions #59

Updated by tinita 11 months ago

  • Description updated (diff)

We also talked about adding older data to influxdb. This should be easy as well by using an according query. But for this we need to deliver the correct timestamp instead of just "today" like in my pull request.

Actions #60

Updated by tinita 11 months ago

It seems this morning the telegraf query ran into a timeout, so it would be good to run the query more often again (as soon as we have merged the PR with the timestamp).

Mar 28 02:00:10 openqa-monitor telegraf[1315]: 2023-03-28T00:00:10Z E! [inputs.exec] Error in plugin: exec: command timed out for command '/etc/telegraf/scripts/backlogger/backlogger.py --output=influxdb /etc/telegraf/scripts/tools-backlog/queries.yaml':

It would also be good to have an option to feed data into influxdb for a certain past timeframe, like I mentioned above. Maybe this should be a script on its own rather than putting all that functionality in the backlogger because we don't need all of this for the backlog status page.

Actions #61

Updated by livdywan 11 months ago

tinita wrote:

It would also be good to have an option to feed data into influxdb for a certain past timeframe, like I mentioned above. Maybe this should be a script on its own rather than putting all that functionality in the backlogger because we don't need all of this for the backlog status page.

Yes wrt refactoring into classes or separate projects. However please file a new ticket since imho this ticket is getting too big if we're including historical data and covering Redmine outages.

Actions #62

Updated by livdywan 11 months ago

Discussed it briefly. We'll consider this resolved once 1) Tina's PR is merged 2) the frequency is hourly again 3) JSON for Tina's dashboard is in salt.

Actions #63

Updated by okurz 11 months ago

As discussed some days ago, this is what I see as missing from https://stats.openqa-monitor.qa.suse.de/d/ck8uu5f4z/agile?orgId=1&refresh=30m:

  1. median values
  2. units, like days/hours
Actions #64

Updated by tinita 11 months ago

We decided that strictly the AC will be fulfilled with the points @cdywan mentioned, and we can see some numbers for now, and wanted to split improvements to a new ticket.

My points would be:

  • Feed individual ticket data to influxdb in order to calculate median
  • Improve script to be able to print data for a certain day or timeframe (maybe split to its own script to avoid adding stuff that backlogger itself does not really need)

Should I create a new ticket?

Actions #65

Updated by okurz 11 months ago

tinita wrote:

We decided that strictly the AC will be fulfilled with the points @cdywan mentioned, and we can see some numbers for now, and wanted to split improvements to a new ticket.

My points would be:

  • Feed individual ticket data to influxdb in order to calculate median
  • Improve script to be able to print data for a certain day or timeframe (maybe split to its own script to avoid adding stuff that backlogger itself does not really need)

Should I create a new ticket?

Well, either the suggested points are handled in this ticket or another one. I don't mind either way. Your choice

Actions #66

Updated by tinita 11 months ago

Actions #68

Updated by livdywan 11 months ago

cdywan wrote:

  • Save the new dashboard JSON in Salt

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/824

Actions #69

Updated by tinita 11 months ago

  • Copied to action #127025: [tools][metrics] Improve cycle + lead times in Grafana added
Actions #70

Updated by livdywan 11 months ago

We have a working dashboard with metrics in it so I consider the ACs fulfilled. See #127025 for the suggested follow-up.

Actions #71

Updated by livdywan 11 months ago

  • Status changed from Feedback to Resolved