action #121582: [tools][metrics] Calculate cycle + lead times for SUSE QE Tools continuously size:M - QA (public) - openSUSE Project Management Tool

#2

Updated by livdywan over 2 years ago

Subject changed from [tools][metrics] Calculate cycle + lead times for SUSE QE Tools continuously to [tools][metrics] Calculate cycle + lead times for SUSE QE Tools continuously size:M
Description updated (diff)
Status changed from New to Workable

#3

Updated by livdywan over 2 years ago

Status changed from Workable to In Progress
Assignee set to livdywan

I'm taking a look. Not sure what existing scripts there might actually be, but I'll try to ask a couple more people to see what I can re-use.

#4

Updated by openqa_review over 2 years ago

Due date set to 2022-12-29

Setting due date based on mean cycle time of SUSE QE Tools

#5

Updated by tinita over 2 years ago

This project from Ivan might be helpful: https://github.com/ilausuch/redmine_statistics

#6

Updated by livdywan over 2 years ago

tinita wrote:

This project from Ivan might be helpful: https://github.com/ilausuch/redmine_statistics

Thanks for suggesting it. I also looked at stunning-octo-chainsaw (which measures numbers from GitHub, if you couldn't guess from its name). Actually, after taking a closer look and playing with it I realized we don't need to compute anything. Grafana will do that based on total counts that are periodically fed in as InfluxDB line protocol.

The backlog status already has the queries. Let's re-use them by making it another output mode.
The backlogger needs to run as part of our salt deployment. I'm using git.latest to fetch the code in order to avoid duplicating things within salt states.
Of course this only falls into place once we're using the backlogger instead of the old fork.

Output currently looks like so:

slo,team="QE Tools",title="Overall Backlog" count=91
slo,team="QE Tools",title="Workable Backlog" count=12
slo,team="QE Tools",title="Exceeding Due Date" count=0
slo,team="QE Tools",title="Untriaged QA" count=0
slo,team="QE Tools",title="Untriaged Tools Tagged" count=0
slo,team="QE Tools",title="SLO immediate (<1 day)" count=0
slo,team="QE Tools",title="SLO urgent (<1 week)" count=0
slo,team="QE Tools",title="SLO high (<1 month)" count=1
slo,team="QE Tools",title="SLO normal (<1 year)" count=0
slo,team="QE Tools",title="In Progress" count=5
slo,team="QE Tools",title="In Feedback" count=13

#7

Updated by livdywan over 2 years ago

Due date changed from 2022-12-29 to 2023-01-13
Status changed from In Progress to Feedback

Will wait on feedback for now, with all pieces in place. Feel free to comment on the sample output to confirm if this is what we want or if we want other numbers. Not going to work on it until next year, though.

#8

Updated by okurz over 2 years ago

I don't know how the above gives cycle and lead times. Can you explain?

#9

Updated by livdywan over 2 years ago

okurz wrote:

I don't know how the above gives cycle and lead times. Can you explain?

Let's take an example. slo,team="QE Tools",title="Workable Backlog" count=12 tells us that the Workable Backlog has 12 issues in it. This is updated hourly (configured in salt). I've not prepared Grafana dashboards yet that would render this into a graph.

I assume this is what we want. If not please explain by example what you might be expecting.

#10

Updated by tinita over 2 years ago

https://kanbanize.com/kanban-resources/kanban-software/kanban-lead-cycle-time

That's why I posted a link to Ivan's code, I think it might help

#11

Updated by okurz over 2 years ago

@cdywan see the definition from the link that tinita posted

#12

Updated by okurz over 2 years ago

Due date changed from 2023-01-13 to 2023-01-20

Christmas grace due date bump :)

#13

Updated by livdywan over 2 years ago

Here's an example including additional fields analogous to what's implemented in the redmine_statistics project. I was hoping to get some early feedback on the minimal approach, but I guess I'm keeping it in the same branch now:

slo,team="QE Tools",status="New",title="Overall Backlog" count=6 avg=4.231111111111111 med=2.4391666666666665 std=48.9769975
slo,team="QE Tools",status="Workable",title="Overall Backlog" count=10 avg=15.109333333333334 med=18.405833333333334 std=49.890591755829895
slo,team="QE Tools",status="In Progress",title="Overall Backlog" count=4 avg=3.396111111111111 med=3.2463888888888888 std=6.536649537037036
slo,team="QE Tools",status="Blocked",title="Overall Backlog" count=5 avg=11.00961111111111 med=7.204722222222222 std=91.1066115200617
slo,team="QE Tools",status="Workable",title="Workable Backlog" count=10 avg=15.109333333333334 med=18.405833333333334 std=49.890591755829895
slo,team="QE Tools",status="Feedback",title="In Feedback" count=13 avg=6.791153846153847 med=3.6266666666666665 std=69.11117296612852

#14

Updated by livdywan over 2 years ago

Status changed from Feedback to In Progress

This is in progress.

#16

Updated by okurz over 2 years ago

Related to action #47891: [functional][u] Continuously update the QSF-u team charts (cycle time, etc.) to know how we perform added

#17

Updated by okurz over 2 years ago

I looked up the original tickets #43442 and #47891. The corresponding code repo is https://github.com/DrMullings/Scripts-Snippets-Stuff

#18

Updated by okurz over 2 years ago

As discussed, "lead time" could be implemented as "time_when_resolved - time_when_created" and "cycle time" as "time_when_resolved - time_when_assigned". The drawback is that tickets which are assigned but not actively worked on also count towards the cycle time, but I consider those rare exceptions that we can maybe avoid if the cycle time alerts us about them. An alternative is to sum up all periods in which the ticket is In Progress or Feedback and leave out the periods in New or Workable, but I would leave that out for now.
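
For illustration, a minimal Python sketch of these two definitions. The function names and the example dates are made up here and are not taken from backlogger or any existing script:

```python
from datetime import datetime, timedelta
from typing import Optional


def lead_time(created: datetime, resolved: datetime) -> timedelta:
    """Lead time: from ticket creation until resolution."""
    return resolved - created


def cycle_time(assigned: Optional[datetime], resolved: datetime) -> Optional[timedelta]:
    """Cycle time: from assignment until resolution.

    Returns None for tickets that were resolved without ever being assigned.
    """
    if assigned is None:
        return None
    return resolved - assigned


# Hypothetical ticket: created on Jan 2, assigned on Jan 9, resolved on Jan 12.
created = datetime(2023, 1, 2, 9, 0)
assigned = datetime(2023, 1, 9, 9, 0)
resolved = datetime(2023, 1, 12, 9, 0)

print(lead_time(created, resolved).days)    # 10
print(cycle_time(assigned, resolved).days)  # 3
```

With this definition the three days between assignment and resolution count fully, whether or not anybody worked on the ticket during that time.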

#19

Updated by livdywan over 2 years ago

Talked about it briefly. @tinita raised the point that we should ideally measure each period of time a ticket is in progress, and scripts snippets stuff doesn't seem to handle that. I was taking a look at the journal data before, although it's not included in my branch so far, and I think that's doable.
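
A rough sketch of what measuring each "in progress" period could look like. It assumes the journal has already been reduced to a chronological list of (timestamp, new_status) pairs, which is a simplification for illustration and not the format the Redmine API returns; the function name is hypothetical:

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple

# (timestamp, new_status) pairs, oldest first. Simplified, hypothetical view
# of a ticket's journal, not the Redmine API format.
StatusChange = Tuple[datetime, str]


def time_in_status(changes: Iterable[StatusChange], status: str, end: datetime) -> timedelta:
    """Sum every period the ticket spent in `status`, up to `end`."""
    total = timedelta()
    entered = None  # when the ticket last entered `status`, if it is in it now
    for when, new_status in changes:
        if new_status == status and entered is None:
            entered = when
        elif new_status != status and entered is not None:
            total += when - entered
            entered = None
    if entered is not None:  # still in `status` at the end of the window
        total += end - entered
    return total


journal = [
    (datetime(2023, 1, 2), "Workable"),
    (datetime(2023, 1, 3), "In Progress"),
    (datetime(2023, 1, 5), "Feedback"),
    (datetime(2023, 1, 9), "In Progress"),
    (datetime(2023, 1, 10), "Resolved"),
]
print(time_in_status(journal, "In Progress", end=datetime(2023, 1, 10)))  # 3 days, 0:00:00
```

The two separate "In Progress" periods (2 days and 1 day) are added up, while the 4 days in Feedback are left out.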

#20

Updated by livdywan over 2 years ago

okurz wrote:

As discussed "lead time" could be implemented by looking at "time_when_resolved - time_when_created", "cycle time" could be "time_when_resolved - time_when_assigned" which has the drawback that tickets that are assigned but not actively worked on also account for the cycle time

Right. That includes the user story "Kim is assigning themself to a ticket a few days before actively working on it".

#21

Updated by livdywan over 2 years ago

Due date changed from 2023-01-20 to 2023-01-27
Status changed from In Progress to Workable

I'm not blocked here, but simply couldn't make time to work on the last step because of other tasks. The integration within Grafana also still needs to be tested, so even then we'd want to allow people to verify that we see the corresponding data on the dashboard.

Maybe I should for now make it Workable in case somebody else has spare cycles and is interested in working on it. Otherwise I plan to pick it up again next week.

#22

Updated by livdywan about 2 years ago

Tags deleted (~~mob~~)

To figure out how the data needs to look Tina and I took an example and analyzed it from the raw data to the query used in a Grafana dashboard:

https://openqa.suse.de/admin/influxdb/jobs contains rows such as this one:
- openqa_jobs_by_worker,url=https://openqa.suse.de,worker=worker5 running=18i
- You can see that worker5 at one point in time has 18 running jobs
- https://stats.openqa-monitor.qa.suse.de/d/nRDab3Jiz/openqa-jobs-test?orgId=1&viewPanel=10&editPanel=10 processes this data
- For different intervals the dashboard shows the average number of running jobs per worker

Transferring this to the context of cycle time we want 1) the number of resolved tickets and 2) the sum of time those tickets spent "in progress", calculated in the backlogger code. An example of multiple days, assuming we feed data daily and look at the data from the last day, could look like so:

day1: count_resolved=1 sum_cycle_time=14
day2: count_resolved=0 sum_cycle_time=0 (no data)
day3: count_resolved=2 sum_cycle_time=10

Grafana will process this data and make available the average/median/whatever we choose in a query for a given time span.

#23

Updated by livdywan about 2 years ago

Due date deleted (~~2023-01-27~~)

Due date is not generally applicable to workable

#25

Updated by livdywan about 2 years ago

Apparently @ilausuch is now pursuing some grand plan to feed all Redmine and Bugzilla tickets into Grafana and make this accessible to all teams, see #123541. So we may not want to duplicate the effort at this point?

#26

Updated by okurz about 2 years ago

cdywan wrote:

Apparently @ilausuch is now pursuing some grand plan to feed all Redmine and Bugzilla tickets into Grafana and make this accessible to all teams, see #123541. So we may not want to duplicate the effort at this point?

I would be happy to learn more about that grand plan and how the QAC squad supports that. As long as that is just a private, personal idea I don't think we can rely on it. I would really prefer a simple approach for now that also provides something useful without relying on data in grafana. Like https://os-autoinst.github.io/qa-tools-backlog-assistant/ showing the cycle number?

#27

Updated by livdywan about 2 years ago

We already have an implementation. And I raised the point once more this week that it would be good to update the corresponding tickets.

For all intents and purposes we'll proceed here as already discussed, I simply didn't have a chance to pick it up for unrelated reasons.

#28

Updated by livdywan about 2 years ago

Status changed from Workable to In Progress

cdywan wrote:

For all intents and purposes we'll proceed here as already discussed, I simply didn't have a chance to pick it up for unrelated reasons.

As discussed in #note-22 I added the cycleTime based on the "in progress" time observed in the journal https://github.com/openSUSE/backlogger/pull/15 and it looks something like this:

slo,team="QE Tools",status="Workable",title="Workable Backlog" count=11
slo,team="QE Tools",status="Feedback",title="In Feedback" count=12
leadTime,team="QE Tools",status="Resolved",title="Closed within last 60 days" count=25 leadTime=672.4589777777778 cycleTime=73.2654111111111

#29

Updated by openqa_review about 2 years ago

Due date set to 2023-03-15

Setting due date based on mean cycle time of SUSE QE Tools

#30

Updated by livdywan about 2 years ago

I proposed https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/500 in addition to deploying the script itself because we need an API to be available.

#31

Updated by livdywan about 2 years ago

Apparently I can manually invoke the script with sudo but Grafana won't recognize slo as a metric i.e. selecting it in FROM and count doesn't show up as a field either 🤔

sudo env REDMINE_API_KEY=... /etc/telegraf/scripts/backlogger/backlogger.py --output=influxdb /etc/telegraf/scripts/tools-backlog/queries.yaml

leadTime is absent in the output I see when running the script manually, but that's expected since I forgot to propose the PR for that: https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/33

#32

Updated by okurz about 2 years ago

On monitor.qa.suse.de journalctl -u telegraf says:

Mar 10 11:00:03 openqa-monitor telegraf[1335]: 2023-03-10T10:00:03Z E! [inputs.exec] Error in plugin: metric parse error: expected field at 1:20: "slo,team=\"QE Tools\",status=\"New\",tit>

You can execute as test yourself:

sudo telegraf --test --config /etc/telegraf/telegraf.d/slo.conf

but of course you can do that without sudo locally.

#33

Updated by livdywan about 2 years ago

Copied to action #125765: Make Telegraf errors visible in alert handling added

#34

Updated by tinita about 2 years ago

What I don't understand is why we are not using the approach as we discussed in our meeting and noted down in https://progress.opensuse.org/issues/121582#note-22

Of course the approach of calculating the cycle time in the python script and deliver it to grafana is easier, but it has drawbacks.

We only get a 60 day time frame. We don't have the possibility to tell Grafana to show the cycle time in, let's say, January compared to February, or the first two weeks of January vs. weeks 3 and 4.
If we picked a smaller time frame than 60 days, then we wouldn't be able to let Grafana tell us the average of a bigger time frame. E.g. in the extreme case of "Week 1 has 1 ticket with cycle time 100 days, week 2 has 10 tickets with cycle time 10 days", what's the average cycle time of weeks 1 and 2? Grafana would show us 55 here (the average of the two weekly averages), which is wrong; the correct value is (100 + 10 × 10) / 11 ≈ 18.

Delivering the number of tickets and the sum of all their cycle times would enable us in Grafana to pick any time frame we want and get the average, calculated by Grafana itself.

Of course, the simple approach could be just enough for our needs, I just wanted to note the limits of it, so that we are not wondering about it later.
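
To make the week 1 / week 2 example concrete, here is a small Python sketch using only the numbers from that example. It compares averaging the weekly averages with dividing the summed cycle times by the number of tickets:

```python
# (ticket_count, average_cycle_time_in_days) per week, from the example above.
weeks = [(1, 100.0), (10, 10.0)]

# Averaging the weekly averages ignores how many tickets each week had.
naive = sum(avg for _, avg in weeks) / len(weeks)

# Dividing the summed cycle times by the ticket count gives the true mean.
total_tickets = sum(count for count, _ in weeks)
total_cycle_time = sum(count * avg for count, avg in weeks)
weighted = total_cycle_time / total_tickets

print(naive)               # 55.0
print(round(weighted, 2))  # 18.18
```

This is why delivering the count and the sum per interval keeps every larger time frame computable later, while delivering only averages does not.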

#35

Updated by livdywan about 2 years ago

tinita wrote:

Of course, the simple approach could be just enough for our needs, I just wanted to note the limits of it, so that we are not wondering about it later.

My comments should have given you the impression that I was following that approach to the best of my knowledge. If that's not the case I guess it needs to be fixed.

#36

Updated by tinita about 2 years ago

My comments should have given you the impression that I was following that approach to the best of my knowledge.

Right, I thought so, but taking a closer look, backlogger.py delivers the averages already, not the sums.

If that's not the case I guess it needs to be fixed.

Adding the sum of cycle and lead times should actually be easy, and then in Grafana we can figure out what we use as soon as we have some data to play with, so I'll try to make a PR.

#37

Updated by tinita about 2 years ago

I just found out that https://progress.opensuse.org/issues.json?query_id=541 gives me 25 issues, but total_count is 119:

% curl -s "https://progress.opensuse.org/issues.json?query_id=541" | jq '.issues[] | .id'  | wc -l
25
% curl -s "https://progress.opensuse.org/issues.json?query_id=541" | jq '.total_count'
119

So we might also suffer from the default limit. Adding per_page=100 doesn't change anything, and it wouldn't be enough for our numbers anyway.

edit: Ah, it needs to be limit and not per_page for JSON queries. But still, the upper limit is 100.
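
Working around the limit would mean paging through the results. A sketch of how that could look in Python, using the `limit` and `offset` parameters and the `total_count` field of the Redmine REST API. The function names are hypothetical and this is not what backlogger does; the `fetch` parameter only exists so the paging logic can be tried without network access:

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator

PAGE_SIZE = 100  # upper limit observed above


def fetch_json(url: str) -> Dict:
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def all_issues(base_url: str, query_id: int,
               fetch: Callable[[str], Dict] = fetch_json) -> Iterator[Dict]:
    """Yield every issue of a saved query, page by page."""
    offset = 0
    while True:
        params = urllib.parse.urlencode(
            {"query_id": query_id, "limit": PAGE_SIZE, "offset": offset})
        page = fetch(f"{base_url}/issues.json?{params}")
        yield from page["issues"]
        offset += len(page["issues"])
        # Stop on an empty page or once everything has been seen.
        if not page["issues"] or offset >= page["total_count"]:
            break
```

For the 119 issues of query 541 this would need two requests (100 + 19), and more as the time frame grows.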

#38

Updated by livdywan about 2 years ago

okurz wrote:

Mar 10 11:00:03 openqa-monitor telegraf[1335]: 2023-03-10T10:00:03Z E! [inputs.exec] Error in plugin: metric parse error: expected field at 1:20: "slo,team=\"QE Tools\",status=\"New\",tit>

I can't say I understand why the format is being rejected, since the way I read the docs double quotes should be fine. It does pass if I strip spaces from the values.

I prepared a branch with a GitHub workflow based on a minimal telegraf config file anyway, which maybe I should've added in the first place: https://github.com/openSUSE/backlogger/pull/16

#39

Updated by tinita about 2 years ago

I can't say I understand why the format is being rejected, since the way I read the docs double quotes should be fine. It does pass if I strip spaces from the values.

Ah, it looks like spaces (and other chars) must be escaped. I searched for space and found this: weather,location\ place=us-midwest temperature=82 1465839830100400200

#40

Updated by tinita about 2 years ago

Given that for 60 days we get a lot of issues and would have to iterate through pages to work around the maximum limit of 100, I would say the approach of just taking a time frame of 1 day and then letting Grafana do the rest for us would be better (and would also save Redmine some CPU ;-)

I just added the leadTimeSum and cycleTimeSum in this PR: https://github.com/openSUSE/backlogger/pull/17

So my suggestion would be to adapt the 541 query to 1 day now, and once we have data in Grafana, we can construct the correct Grafana query.

#41

Updated by livdywan about 2 years ago

tinita wrote:

So my suggestion would be to adapt the 541 query to 1 day now, and once we have data in Grafana, we can construct the correct Grafana query.

We should use a new query then, since I did not create this one. But otherwise sounds good to me.

tinita wrote:

I can't say I understand why the format is being rejected, since the way I read the docs double quotes should be fine. It does pass if I strip spaces from the values.

Ah, it looks like spaces (and other chars) must be escaped. I searched for space and found this: weather,location\ place=us-midwest temperature=82 1465839830100400200

I still don't understand it but it seems escaping the space after the tags makes it work:

slo,host=tumbleweed.hel,status="Workable",team="Example",title="Workable Backlog"\ count=15 1678

#42

Updated by tinita about 2 years ago

I rather thought the spaces inside the fields should be escaped, e.g. title="Workable\ Backlog"
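
A small Python sketch of that escaping, following the InfluxDB line protocol rules for tag keys and tag values (comma, equals sign and space get a backslash). The helper names are hypothetical and this is not the backlogger code; it also assumes a simple measurement name and numeric field values, and it separates multiple fields with commas as the protocol expects:

```python
def escape_tag(value: str) -> str:
    """Escape a tag key or tag value for InfluxDB line protocol.

    Commas, equals signs and spaces need a backslash. Double quotes are not
    special in tags, so quoting a tag value makes the quotes part of the value.
    """
    for char in (",", "=", " "):
        value = value.replace(char, "\\" + char)
    return value


def format_point(measurement: str, tags: dict, fields: dict) -> str:
    """Build one line: measurement,tag=value,... field=value,..."""
    tag_part = "".join(f",{escape_tag(k)}={escape_tag(v)}" for k, v in tags.items())
    field_part = ",".join(f"{escape_tag(k)}={v}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part}"


print(format_point("slo", {"team": "QE Tools", "title": "Workable Backlog"}, {"count": 15}))
# slo,team=QE\ Tools,title=Workable\ Backlog count=15
```

The only unescaped space left in the line is the one separating the tag set from the field set, which is what the parser relies on.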

#43

Updated by tinita about 2 years ago

I added a query now: https://progress.opensuse.org/issues?query_id=773 QE tools team - closed yesterday

This way we also don't depend on the exact time at which the query is run, as long as it is run once a day.

#44

Updated by livdywan about 2 years ago

https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/34

So it turns out the "last closed" query looked like this in markdown:

Closed yesterday | 0 | <1, >0 | 🔴

#45

Updated by tinita about 2 years ago

We should also investigate how to get better error messages from telegraf.
This was the error shown in the journal, when the redmine query returned a 403:

[inputs.exec] Error in plugin: exec: exit status 1 for command '/etc/telegraf/scripts/backlogger/backlogger.py --output=influxdb /etc/telegraf/scripts/tools-backlog/queries.yaml': Traceback (most recent call last):...

That's also all I could see when calling telegraf --test --config slo.conf locally.

Note that the 3 dots at the end are literal and not added by me. I did not see any further explaining error there, but the backlogger output clearly showed the 403. So telegraf is swallowing useful information about errors.

#46

Updated by tinita about 2 years ago

It seems that truncating the error message to the first line is a feature :(
https://github.com/influxdata/telegraf/issues/5415
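
One possible workaround on the script side, sketched in Python under the assumption (based on the journal output above) that only the first line of stderr ends up in the telegraf error: print a one-line summary before the traceback. The wrapper name `run` and the message prefix are made up for this sketch:

```python
import sys
import traceback


def run(main) -> int:
    """Run `main` and report any failure with a one-line summary first."""
    try:
        main()
    except Exception as error:  # top-level catch-all, only to shape the output
        # Summary first, so it survives truncation to a single line.
        print(f"backlogger failed: {type(error).__name__}: {error}", file=sys.stderr)
        traceback.print_exc()  # full traceback follows for interactive runs
        return 1
    return 0


# In a script this would be used as: sys.exit(run(main))
```

With something like this a 403 from Redmine would show up in the journal as part of the first line instead of just "Traceback (most recent call last):".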

#47

Updated by tinita about 2 years ago

As a side note, the backlog ticket counts can already be watched in this graph: https://stats.openqa-monitor.qa.suse.de/d/1pHb56Lnk/tinas-dashboard?from=now-2d&to=now&viewPanel=22

#48

Updated by livdywan about 2 years ago

Next 3 steps before putting this in Feedback:

Reintroduce the closed query to the QE Tools status
Reduce frequency in Grafana to daily
Save the new dashboard JSON in Salt

#49

Updated by livdywan about 2 years ago

cdywan wrote:

Next 3 steps before putting this in Feedback:

Reintroduce the closed query to the QE Tools status

https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/35

#50

Updated by livdywan about 2 years ago

cdywan wrote:

Next 3 steps before putting this in Feedback:

Reintroduce the closed query to the QE Tools status

Reduce frequency in Grafana to daily

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/810

#51

Updated by tinita about 2 years ago

A question about the data points we are feeding into influxdb.
This is the current output I get from --output=influxdb:

slo,team="QE\ Tools",status="New",title="Overall\ Backlog" count=5
slo,team="QE\ Tools",status="Workable",title="Overall\ Backlog" count=17
slo,team="QE\ Tools",status="In\ Progress",title="Overall\ Backlog" count=6
slo,team="QE\ Tools",status="Blocked",title="Overall\ Backlog" count=51
slo,team="QE\ Tools",status="Feedback",title="Overall\ Backlog" count=17
slo,team="QE\ Tools",status="Workable",title="Workable\ Backlog" count=17
slo,team="QE\ Tools",status="In\ Progress",title="In\ Progress" count=6
slo,team="QE\ Tools",status="Feedback",title="In\ Feedback" count=20

So we are getting certain numbers multiple times, and even different numbers, e.g. for Feedback we get 17 and 20.

I looked into the code and it is not really clear to me now why we would go over all of the queries in queries.yaml and sum up numbers per ticket status from all of them.

#52

Updated by livdywan about 2 years ago

Due date changed from 2023-03-15 to 2023-03-31
Status changed from In Progress to Feedback

cdywan wrote:

cdywan wrote:

Next 3 steps before putting this in Feedback:

Reintroduce the closed query to the QE Tools status

https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/35

This hasn't been merged yet because unfortunately we're again discussing the backlogger design in the downstream PR after having implemented it.

Maybe we want to omit unrestricted queries after all.
Maybe we want separate queries.yaml files.
Specifying output in queries.yaml has also been suggested.

I looked into the code and it is not really clear to me now why we would go over all of the queries in queries.yaml and sum up numbers per ticket status from all of them

Ashley Average wants to monitor the number of tickets in Feedback in Grafana.

That's the user story. I don't know if it's efficient or free of bugs. The answer as to "Why" is that it's the easiest way and requires no configuration.

#53

Updated by livdywan about 2 years ago

cdywan wrote:

cdywan wrote:

Next 3 steps before putting this in Feedback:

Reintroduce the closed query to the QE Tools status

Reduce frequency in Grafana to daily

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/810

The change has been deployed but it doesn't seem to be effective...

sudo grep interval /etc/telegraf/telegraf.d/slo.conf
  interval = "1d"

#54

Updated by okurz about 2 years ago

Copied to action #126113: [tools][metrics] Only show queries in backlogger output that are relevant for the according output mode added

#55

Updated by livdywan about 2 years ago

cdywan wrote:

cdywan wrote:

Next 3 steps before putting this in Feedback:

Reintroduce the closed query to the QE Tools status

https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/35

Merged.

Let's give it time and prepare the new panel next week.

Oli will file a ticket on the output configuration.

#56

Updated by livdywan about 2 years ago

error loading config file /etc/telegraf/telegraf.d/slo.conf: error parsing exec, line 1:{0 286}: error parsing duration: time: unknown unit "d" in duration "1d"

Apparently 1d is not a valid unit for the interval: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/813
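
Judging by the wording of the error message, the interval is parsed with Go's duration syntax, where hours are the largest unit, so a day would have to be written as 24h. A rough Python check of that syntax, as a simplified sketch only (no signs, no bare 0, the µs spelling is left out) with a hypothetical helper name:

```python
import re

# Units understood by Go's time.ParseDuration; there is no unit for days.
GO_DURATION_UNITS = ("ns", "us", "ms", "s", "m", "h")
_UNIT = "|".join(GO_DURATION_UNITS)
_DURATION = re.compile(rf"(?:\d+(?:\.\d+)?(?:{_UNIT}))+")


def is_go_duration(text: str) -> bool:
    """Return True if `text` is a sequence of number+unit pairs, e.g. "1h30m"."""
    return _DURATION.fullmatch(text) is not None


print(is_go_duration("1d"))     # False
print(is_go_duration("24h"))    # True
print(is_go_duration("1h30m"))  # True
```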

#57

Updated by tinita about 2 years ago

I was thinking about the interval and possibly duplicate entries (if the query would run multiple times per day) and found this:
https://docs.influxdata.com/influxdb/v1.8/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points

So in order to not lose data if one query fails (and we only make one query per day), we could add the unique timestamp of the data (e.g. use the time 00:00:00 of the day the query is made) to the output and then run the query every 8 hours for example. Influxdb would still see it as one data point if they all have the same timestamp.
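
A sketch of that idea in Python. It assumes UTC midnight as "00:00:00 of the day" and uses a hypothetical function name; the example line at the end is made up for illustration:

```python
from datetime import datetime, timezone
from typing import Optional


def midnight_ns(now: Optional[datetime] = None) -> int:
    """Nanosecond Unix timestamp of 00:00:00 UTC on the day of `now`."""
    now = now or datetime.now(timezone.utc)
    midnight = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0)
    return int(midnight.timestamp()) * 1_000_000_000


# Two runs on the same day produce the same timestamp, so InfluxDB treats
# them as the same point as long as measurement and tags are identical.
morning = datetime(2023, 3, 28, 2, 0, tzinfo=timezone.utc)
evening = datetime(2023, 3, 28, 18, 0, tzinfo=timezone.utc)
assert midnight_ns(morning) == midnight_ns(evening)

# Hypothetical data point with the timestamp appended as the last element.
print(f"leadTime,team=QE\\ Tools count=2 {midnight_ns(morning)}")
```

A later run on the same day then overwrites the fields of the earlier point instead of adding a second one.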

That strategy can also be used once we feed data for every individual ticket to influxdb.

#58

Updated by tinita about 2 years ago

https://github.com/openSUSE/backlogger/pull/21 Add timestamp in nanoseconds for influxdb output

#59

Updated by tinita about 2 years ago

Description updated (diff)

We talked about also adding older data to influxdb, which should also be easy by using a corresponding query. But for this we need to deliver the correct timestamp instead of just "today" like in my pull request.

#60

Updated by tinita about 2 years ago

It seems this morning the telegraf query ran into a timeout, so it would be good to run the query more often again (as soon as we have merged the PR with the timestamp).

Mar 28 02:00:10 openqa-monitor telegraf[1315]: 2023-03-28T00:00:10Z E! [inputs.exec] Error in plugin: exec: command timed out for command '/etc/telegraf/scripts/backlogger/backlogger.py --output=influxdb /etc/telegraf/scripts/tools-backlog/queries.yaml':

It would also be good to have an option to feed data into influxdb for a certain past time frame, like I mentioned above. Maybe this should be a script on its own rather than putting all that functionality in the backlogger because we don't need all of this for the backlog status page.

#61

Updated by livdywan about 2 years ago

tinita wrote:

It would also be good to have an option to feed data into influxdb for a certain past time frame, like I mentioned above. Maybe this should be a script on its own rather than putting all that functionality in the backlogger because we don't need all of this for the backlog status page.

Yes wrt refactoring into classes or separate projects. However please file a new ticket since imho this ticket is getting too big if we're including historical data and covering Redmine outages.

#62

Updated by livdywan about 2 years ago

Discussed it briefly. We'll consider this resolved once 1) Tina's PR is merged 2) the frequency is hourly again 3) JSON for Tina's dashboard is in salt.

#63

Updated by okurz about 2 years ago

As we discussed some days ago, this is what I still see as missing from https://stats.openqa-monitor.qa.suse.de/d/ck8uu5f4z/agile?orgId=1&refresh=30m:

median values
units, like days/hours

#64

Updated by tinita about 2 years ago

We decided that strictly the AC will be fulfilled with the points @cdywan mentioned, and we can see some numbers for now, and wanted to split improvements to a new ticket.

My points would be:

Feed individual ticket data to influxdb in order to calculate median
Improve script to be able to print data for a certain day or time frame (maybe split to its own script to avoid adding stuff that backlogger itself does not really need)

Should I create a new ticket?

#65

Updated by okurz about 2 years ago

tinita wrote:

We decided that strictly the AC will be fulfilled with the points @cdywan mentioned, and we can see some numbers for now, and wanted to split improvements to a new ticket.

My points would be:

Feed individual ticket data to influxdb in order to calculate median

Improve script to be able to print data for a certain day or time frame (maybe split to its own script to avoid adding stuff that backlogger itself does not really need)

Should I create a new ticket?

Well, either the suggested points are handled in this ticket or another one. I don't mind either way. Your choice

#66

Updated by tinita about 2 years ago

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/823 Decrease interval for backlogger query

#67

Updated by tinita about 2 years ago

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/823 merged, I restarted telegraf

#68

Updated by livdywan about 2 years ago

cdywan wrote:

Save the new dashboard JSON in Salt

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/824

#69

Updated by tinita about 2 years ago

Copied to action #127025: [tools][metrics] Improve cycle + lead times in Grafana added

#70

Updated by livdywan about 2 years ago

We have a working dashboard with metrics in it so I consider the ACs fulfilled. See #127025 for the suggested follow-up.

#71

Updated by livdywan about 2 years ago

Status changed from Feedback to Resolved
