action #121582
closed[tools][metrics] Calculate cycle + lead times for SUSE QE Tools continuously size:M
0%
Description
Motivation¶
We already collected cycle and lead times in the past with some scripts. They were then added to monitor.qa.suse.de but eventually stopped working, and nobody has resurrected them yet. LSG QE tracks metrics and tries to expand them, see #118135, so we should contribute to that by bringing back cycle and lead time evaluations that can help us in our daily work.
Acceptance criteria¶
- AC1: Up-to-date cycle and lead times for SUSE QE Tools can be found on monitor.qa.suse.de
Suggestions¶
- See https://progress.opensuse.org/projects/qa/wiki/Tools#Target-numbers-or-guideline-should-be-in-priorities
- See how to come up with ticket counts, maybe per priority, to push into influxdb
- Try and dig up the existing script and see how that worked
- Look into Redmine API to find out how to get numbers that can be fed into influxdb
- Take a look at the Redmine REST API docs (https://www.redmine.org/projects/redmine/wiki/rest_api)
- Maybe start with a script using the API to get the right data, and then think about how to get the data into Grafana
- Also add older data to influxdb
Updated by livdywan about 2 years ago
- Subject changed from [tools][metrics] Calculate cycle + lead times for SUSE QE Tools continuously to [tools][metrics] Calculate cycle + lead times for SUSE QE Tools continuously size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by livdywan about 2 years ago
- Status changed from Workable to In Progress
- Assignee set to livdywan
I'm taking a look. Not sure what existing scripts there might actually be, but I'll try to ask a couple more people to see what I can re-use.
Updated by openqa_review about 2 years ago
- Due date set to 2022-12-29
Setting due date based on mean cycle time of SUSE QE Tools
Updated by tinita about 2 years ago
This project from Ivan might be helpful: https://github.com/ilausuch/redmine_statistics
Updated by livdywan almost 2 years ago
tinita wrote:
This project from Ivan might be helpful: https://github.com/ilausuch/redmine_statistics
Thanks for suggesting it. I also looked at stunning-octo-chainsaw (which measures numbers from GitHub, if you couldn't guess from its name). Actually, after taking a closer look and playing with it, I realized we don't need to compute anything ourselves: Grafana will do that based on InfluxDB line protocol data, with total counts pushed periodically.
- The backlog status already has the queries. Let's re-use them by making it another output mode.
- The backlogger needs to run as part of our salt deployment. I'm using git.latest to fetch the code in order to avoid duplicating things within salt states.
- Of course this only falls into place once we're using the backlogger instead of the old fork.
Output currently looks like so:
slo,team="QE Tools",title="Overall Backlog" count=91
slo,team="QE Tools",title="Workable Backlog" count=12
slo,team="QE Tools",title="Exceeding Due Date" count=0
slo,team="QE Tools",title="Untriaged QA" count=0
slo,team="QE Tools",title="Untriaged Tools Tagged" count=0
slo,team="QE Tools",title="SLO immediate (<1 day)" count=0
slo,team="QE Tools",title="SLO urgent (<1 week)" count=0
slo,team="QE Tools",title="SLO high (<1 month)" count=1
slo,team="QE Tools",title="SLO normal (<1 year)" count=0
slo,team="QE Tools",title="In Progress" count=5
slo,team="QE Tools",title="In Feedback" count=13
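A sketch of how lines like the sample above could be generated (a hypothetical helper, not the actual backlogger code; the measurement name slo and the tag layout are copied from the sample output, everything else is an assumption):

```python
# Hypothetical sketch: emit InfluxDB line protocol for backlog counts.
# Note: strict line protocol requires backslash-escaping spaces in tag
# values; this mirrors the quoted sample output above as-is.

def to_line_protocol(team, counts):
    """Render one line protocol line per query title."""
    lines = []
    for title, count in counts.items():
        lines.append(f'slo,team="{team}",title="{title}" count={count}')
    return "\n".join(lines)

print(to_line_protocol("QE Tools", {"Overall Backlog": 91, "Workable Backlog": 12}))
```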
Updated by livdywan almost 2 years ago
- Due date changed from 2022-12-29 to 2023-01-13
- Status changed from In Progress to Feedback
Will wait on feedback for now, with all pieces in place. Feel free to comment on the sample output to confirm whether this is what we want or whether we want other numbers. I'm not going to work on it until next year, though.
Updated by okurz almost 2 years ago
I don't know how the above give cycle and lead times. Can you explain?
Updated by livdywan almost 2 years ago
okurz wrote:
I don't know how the above give cycle and lead times. Can you explain?
Let's take an example. slo,team="QE Tools",title="Workable Backlog" count=12
tells us that the Workable Backlog has 12 issues in it. This is updated hourly (configured in salt). I've not prepared Grafana dashboards yet that would render this into a graph.
I assume this is what we want. If not please explain by example what you might be expecting.
Updated by tinita almost 2 years ago
https://kanbanize.com/kanban-resources/kanban-software/kanban-lead-cycle-time
That's why I posted a link to Ivan's code, I think it might help
Updated by okurz almost 2 years ago
@cdywan see the definition from the link that tinita posted
Updated by okurz almost 2 years ago
- Due date changed from 2023-01-13 to 2023-01-20
christmas grace due date bump :)
Updated by livdywan almost 2 years ago
Here's an example including additional fields analogous to what's implemented in the redmine_statistics project. I was hoping to get some early feedback on the minimal approach, but I guess I'm keeping it in the same branch now:
slo,team="QE Tools",status="New",title="Overall Backlog" count=6 avg=4.231111111111111 med=2.4391666666666665 std=48.9769975
slo,team="QE Tools",status="Workable",title="Overall Backlog" count=10 avg=15.109333333333334 med=18.405833333333334 std=49.890591755829895
slo,team="QE Tools",status="In Progress",title="Overall Backlog" count=4 avg=3.396111111111111 med=3.2463888888888888 std=6.536649537037036
slo,team="QE Tools",status="Blocked",title="Overall Backlog" count=5 avg=11.00961111111111 med=7.204722222222222 std=91.1066115200617
slo,team="QE Tools",status="Workable",title="Workable Backlog" count=10 avg=15.109333333333334 med=18.405833333333334 std=49.890591755829895
slo,team="QE Tools",status="Feedback",title="In Feedback" count=13 avg=6.791153846153847 med=3.6266666666666665 std=69.11117296612852
Updated by livdywan almost 2 years ago
- Status changed from Feedback to In Progress
This is in progress.
Updated by okurz almost 2 years ago
- Related to action #47891: [functional][u] Continuously update the QSF-u team charts (cycle time, etc.) to know how we perform added
Updated by okurz almost 2 years ago
I looked up the original tickets #43442 and #47891 and the corresponding code repo https://github.com/DrMullings/Scripts-Snippets-Stuff
Updated by okurz almost 2 years ago
As discussed, "lead time" could be implemented as "time_when_resolved - time_when_created", and "cycle time" as "time_when_resolved - time_when_assigned". The latter has the drawback that tickets that are assigned but not actively worked on also count towards the cycle time, but I consider those rare exceptions that we can perhaps avoid once the cycle time alerts us about them. An alternative is to sum up all periods when a ticket is in progress or feedback, excluding new or workable, but I would leave that out for now.
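The two definitions above boil down to simple timestamp arithmetic. A minimal sketch, assuming a hypothetical ticket dict for illustration rather than the actual Redmine schema:

```python
from datetime import datetime, timedelta

# Sketch of the two definitions above. The ticket dict layout is an
# assumption for illustration, not the actual Redmine API schema.

def lead_time(ticket):
    # lead time = time_when_resolved - time_when_created
    return ticket["resolved_on"] - ticket["created_on"]

def cycle_time(ticket):
    # cycle time = time_when_resolved - time_when_assigned
    # (overestimates when a ticket sits assigned but untouched)
    return ticket["resolved_on"] - ticket["assigned_on"]

ticket = {
    "created_on": datetime(2023, 1, 1),
    "assigned_on": datetime(2023, 1, 5),
    "resolved_on": datetime(2023, 1, 12),
}
print(lead_time(ticket).days, cycle_time(ticket).days)  # 11 7
```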
Updated by livdywan almost 2 years ago
Talked about it briefly. @tinita raised the point that we should ideally measure each period of time a ticket is in progress, and Scripts-Snippets-Stuff doesn't seem to handle that. I was taking a look at the journal data before, although it's not included in my branch so far, and I think that's doable.
Updated by livdywan almost 2 years ago
okurz wrote:
As discussed "lead time" could be implemented by looking at "time_when_resolved - time_when_created", "cycle time" could be "time_when_resolved - time_when_assigned" which has the drawback that tickets that are assigned but not actively worked on also account for the cycle time
Right. That includes the user story "Kim assigns themselves to a ticket a few days before actively working on it".
Updated by livdywan almost 2 years ago
- Due date changed from 2023-01-20 to 2023-01-27
- Status changed from In Progress to Workable
I'm not blocked here, but simply couldn't make time to work on the last step because of other tasks. The integration within Grafana also still needs to be tested, so even then we'd want to allow people to verify that the corresponding data shows up on the dashboard.
Maybe I should make it Workable for now in case somebody else has spare cycles and is interested in working on it. Otherwise I plan to pick it up again next week.
Updated by livdywan almost 2 years ago
- Tags deleted (mob)
To figure out how the data needs to look, Tina and I took an example and analyzed it from the raw data to the query used in a Grafana dashboard:
- https://openqa.suse.de/admin/influxdb/jobs contains rows such as this one:
- openqa_jobs_by_worker,url=https://openqa.suse.de,worker=worker5 running=18i
- You can see that worker5 at one point in time has 18 running jobs
- https://stats.openqa-monitor.qa.suse.de/d/nRDab3Jiz/openqa-jobs-test?orgId=1&viewPanel=10&editPanel=10 processes this data
- For different intervals the dashboard shows the average number of running jobs per worker
Transferring this to the context of cycle time, we want 1) the number of resolved tickets and 2) the sum of time those tickets spent "in progress", both calculated in the backlogger code. An example over multiple days, assuming we feed data daily and look at the data from the last day, could look like this:
day1: count_resolved=1 sum_cycle_time=14
day2: count_resolved=0 sum_cycle_time=0 (no data)
day3: count_resolved=2 sum_cycle_time=10
Grafana will process this data and make the average/median/whatever we choose available via a query for a given time span.
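With daily count_resolved/sum_cycle_time pairs like the ones above, the average over any chosen span is the sum of the time sums divided by the sum of the counts. A sketch of what the Grafana query would effectively compute:

```python
# Mirror of what a Grafana query would compute from the daily data above:
# average cycle time over a span = sum(sum_cycle_time) / sum(count_resolved).

days = [
    {"count_resolved": 1, "sum_cycle_time": 14},  # day1
    {"count_resolved": 0, "sum_cycle_time": 0},   # day2 (no data)
    {"count_resolved": 2, "sum_cycle_time": 10},  # day3
]

total_tickets = sum(d["count_resolved"] for d in days)
total_time = sum(d["sum_cycle_time"] for d in days)
avg_cycle_time = total_time / total_tickets if total_tickets else 0
print(avg_cycle_time)  # 24 / 3 = 8.0
```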
Updated by livdywan almost 2 years ago
- Due date deleted (2023-01-27)
A due date is not generally applicable to Workable tickets
Updated by livdywan almost 2 years ago
Apparently @ilausuch is now pursuing some grand plan to feed all Redmine and Bugzilla tickets into Grafana and make this accessible to all teams, see #123541. So we may not want to duplicate the effort at this point?
Updated by okurz almost 2 years ago
cdywan wrote:
Apparently @ilausuch is now pursuing some grand plan to feed all Redmine and Bugzilla tickets into Grafana and make this accessible to all teams, see #123541. So we may not want to duplicate the effort at this point?
I would be happy to learn more about that grand plan and how the QAC squad supports that. As long as that is just a private, personal idea I don't think we can rely on it. I would really prefer a simple approach for now that also provides something useful without relying on data in grafana. Like https://os-autoinst.github.io/qa-tools-backlog-assistant/ showing the cycle number?
Updated by livdywan almost 2 years ago
We already have an implementation. And I raised the point once more this week that it would be good to update the corresponding tickets.
For all intents and purposes we'll proceed here as already discussed, I simply didn't have a chance to pick it up for unrelated reasons.
Updated by livdywan almost 2 years ago
- Status changed from Workable to In Progress
cdywan wrote:
For all intents and purposes we'll proceed here as already discussed, I simply didn't have a chance to pick it up for unrelated reasons.
As discussed in #note-22 I added the cycleTime based on the "in progress" time observed in the journal https://github.com/openSUSE/backlogger/pull/15 and it looks something like this:
slo,team="QE Tools",status="Workable",title="Workable Backlog" count=11
slo,team="QE Tools",status="Feedback",title="In Feedback" count=12
leadTime,team="QE Tools",status="Resolved",title="Closed within last 60 days" count=25 leadTime=672.4589777777778 cycleTime=73.2654111111111
Updated by openqa_review almost 2 years ago
- Due date set to 2023-03-15
Setting due date based on mean cycle time of SUSE QE Tools
Updated by livdywan almost 2 years ago
I proposed https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/500 in addition to deploying the script itself because we need an API to be available.
Updated by livdywan almost 2 years ago
Apparently I can manually invoke the script with sudo, but Grafana won't recognize slo as a metric, i.e. when selecting it in FROM, and count doesn't show up as a field either 🤔
sudo env REDMINE_API_KEY=... /etc/telegraf/scripts/backlogger/backlogger.py --output=influxdb /etc/telegraf/scripts/tools-backlog/queries.yaml
leadTime is absent in the output I see when running the script manually, but that's expected since I forgot to propose the PR for that: https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/33
Updated by okurz almost 2 years ago
On monitor.qa.suse.de journalctl -u telegraf
says:
Mar 10 11:00:03 openqa-monitor telegraf[1335]: 2023-03-10T10:00:03Z E! [inputs.exec] Error in plugin: metric parse error: expected field at 1:20: "slo,team=\"QE Tools\",status=\"New\",tit>
You can execute as test yourself:
sudo telegraf --test --config /etc/telegraf/telegraf.d/slo.conf
but of course you can do that without sudo locally.
Updated by livdywan almost 2 years ago
- Copied to action #125765: Make Telegraf errors visible in alert handling added
Updated by tinita almost 2 years ago
What I don't understand is why we are not using the approach as we discussed in our meeting and noted down in https://progress.opensuse.org/issues/121582#note-22
Of course the approach of calculating the cycle time in the Python script and delivering it to Grafana is easier, but it has drawbacks.
We only get a 60-day time frame. We don't have the possibility to tell Grafana to show the cycle time in, let's say, January compared to February, or the first two weeks of January vs. weeks 3 and 4.
If we picked a smaller timeframe than 60 days, then we wouldn't be able to let Grafana tell us the average of a bigger time frame. E.g. in the extreme case of "week 1 has 1 ticket with a cycle time of 100 days, week 2 has 10 tickets with a cycle time of 10 days each", what's the average cycle time of weeks 1 and 2? Grafana would show us 55 here, which is wrong; it should be ~18.
Delivering the number of tickets and the sum of all their cycle times would enable us in Grafana to pick any time frame we want and get the average, calculated by Grafana itself.
Of course, the simple approach could be just enough for our needs, I just wanted to note the limits of it, so that we are not wondering about it later.
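The arithmetic of that extreme example, spelled out (a plain illustration, not backlogger code):

```python
# The pitfall from the example above: averaging per-week averages vs a
# properly weighted average over both weeks combined.

week1 = [100]       # 1 ticket with a cycle time of 100 days
week2 = [10] * 10   # 10 tickets with 10 days each

avg_of_averages = (sum(week1) / len(week1) + sum(week2) / len(week2)) / 2
weighted_avg = (sum(week1) + sum(week2)) / (len(week1) + len(week2))
print(avg_of_averages)  # 55.0 (average of per-week averages, misleading)
print(weighted_avg)     # 200 / 11, i.e. ~18.2 (the actual average)
```

Delivering counts and sums, as suggested, lets Grafana compute the weighted version for any span.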
Updated by livdywan almost 2 years ago
tinita wrote:
Of course, the simple approach could be just enough for our needs, I just wanted to note the limits of it, so that we are not wondering about it later.
My comments should have given you the impression that I was following that approach to the best of my knowledge. If that's not the case I guess it needs to be fixed.
Updated by tinita almost 2 years ago
My comments should have given you the impression that I was following that approach to the best of my knowledge.
Right, I thought so, but taking a closer look, backlogger.py delivers the averages already, not the sums.
If that's not the case I guess it needs to be fixed.
Adding the sum of cycle and lead times should actually be easy, and then in Grafana we can figure out what we use as soon as we have some data to play with, so I'll try to make a PR.
Updated by tinita almost 2 years ago
I just found out that https://progress.opensuse.org/issues.json?query_id=541 gives me 25 issues, but total_count is 119:
% curl -s "https://progress.opensuse.org/issues.json?query_id=541" | jq '.issues[] | .id' | wc -l
25
% curl -s "https://progress.opensuse.org/issues.json?query_id=541" | jq '.total_count'
119
So we might also suffer from the default limit. Adding per_page=100 doesn't change anything, and it wouldn't be enough for our numbers anyway.
edit: Ah, it needs to be limit and not per_page for JSON queries. But still, the upper limit is 100.
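One way around the limit of 100 is paging with the documented Redmine offset/limit parameters, roughly like this (a sketch; query_id=541 is the query from above, and API key handling is omitted for brevity):

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Sketch: page through a Redmine issue query to work around the
# 100-issues-per-response limit of the REST API.

def fetch_page(base, query_id, offset, limit):
    """Fetch one page of a saved query via the Redmine REST API."""
    qs = urlencode({"query_id": query_id, "limit": limit, "offset": offset})
    with urlopen(f"{base}/issues.json?{qs}") as response:
        return json.load(response)

def all_issues(base="https://progress.opensuse.org", query_id=541,
               limit=100, fetch=fetch_page):
    """Collect all issues by paging until total_count is reached."""
    issues, offset = [], 0
    while True:
        data = fetch(base, query_id, offset, limit)
        issues.extend(data["issues"])
        offset += limit
        if offset >= data["total_count"]:
            return issues
```

The fetch parameter exists only so the paging logic can be exercised without network access.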
Updated by livdywan almost 2 years ago
okurz wrote:
Mar 10 11:00:03 openqa-monitor telegraf[1335]: 2023-03-10T10:00:03Z E! [inputs.exec] Error in plugin: metric parse error: expected field at 1:20: "slo,team=\"QE Tools\",status=\"New\",tit>
I can't say I understand why the format is being rejected, since the way I read the docs double quotes should be fine. It does pass if I strip spaces from the values.
I prepared a branch with a GitHub workflow based on a minimal telegraf config file anyway, which maybe I should've added in the first place: https://github.com/openSUSE/backlogger/pull/16
Updated by tinita almost 2 years ago
I can't say I understand why the format is being rejected, since the way I read the docs double quotes should be fine. It does pass if I strip spaces from the values.
Ah, it looks like spaces (and other chars) must be escaped. I searched for space and found this: weather,location\ place=us-midwest temperature=82 1465839830100400200
Updated by tinita almost 2 years ago
Given that for 60 days we get a lot of issues and would have to iterate through pages to work around the maximum limit of 100, I would say the approach of just taking a timeframe of 1 day and then letting Grafana do the rest for us would be better (and it would also save Redmine some CPU ;-)
I just added the leadTimeSum and cycleTimeSum in this PR: https://github.com/openSUSE/backlogger/pull/17
So my suggestion would be to adapt the 541 query to 1 day now, and once we have data in Grafana, we can construct the correct Grafana query.
Updated by livdywan almost 2 years ago
tinita wrote:
So my suggestion would be to adapt the 541 query to 1 day now, and once we have data in Grafana, we can construct the correct Grafana query.
We should use a new query then, since I did not create this one. But otherwise sounds good to me.
tinita wrote:
I can't say I understand why the format is being rejected, since the way I read the docs double quotes should be fine. It does pass if I strip spaces from the values.
Ah, it looks like spaces (and other chars) must be escaped. I searched for space and found this:
weather,location\ place=us-midwest temperature=82 1465839830100400200
I still don't understand it but it seems escaping the space after the tags makes it work:
slo,host=tumbleweed.hel,status="Workable",team="Example",title="Workable Backlog"\ count=15 1678
Updated by tinita almost 2 years ago
I rather thought the spaces inside the fields should be escaped, e.g. title="Workable\ Backlog"
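Following that reading of the docs, an escaping helper could look like this (a sketch; in line protocol, tag values are written without surrounding quotes, and commas, equals signs and spaces in them need backslash-escaping):

```python
# Sketch of InfluxDB line protocol tag escaping: commas, equals signs
# and spaces in tag keys/values must be backslash-escaped, and tag
# values carry no surrounding quotes.

def escape_tag(value):
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value

tags = {"team": "QE Tools", "title": "Workable Backlog"}
tag_str = ",".join(f"{k}={escape_tag(v)}" for k, v in sorted(tags.items()))
line = f"slo,{tag_str} count=12"
print(line)  # slo,team=QE\ Tools,title=Workable\ Backlog count=12
```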
Updated by tinita almost 2 years ago
I added a query now: https://progress.opensuse.org/issues?query_id=773 QE tools team - closed yesterday
This way we also don't depend on the exact time at which the query is run, as long as it is run once a day.
Updated by livdywan almost 2 years ago
https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/34
So it turns out the "last closed" query looked like this in markdown:
Closed yesterday | 0 | <1, >0 | 🔴
Updated by tinita almost 2 years ago
We should also investigate how to get better error messages from telegraf.
This was the error shown in the journal, when the redmine query returned a 403:
[inputs.exec] Error in plugin: exec: exit status 1 for command '/etc/telegraf/scripts/backlogger/backlogger.py --output=influxdb /etc/telegraf/scripts/tools-backlog/queries.yaml': Traceback (most recent call last):...
That's also all I could see when calling telegraf --test --config slo.conf locally.
Note that the 3 dots at the end are literal and not added by me. I did not see any further explanatory error there, but the backlogger output clearly showed the 403. So telegraf is swallowing useful information about errors.
Updated by tinita almost 2 years ago
It seems that truncating the error message to the first line is a feature :(
https://github.com/influxdata/telegraf/issues/5415
Updated by tinita almost 2 years ago
As a side note, the backlog ticket counts can already be watched in this graph: https://stats.openqa-monitor.qa.suse.de/d/1pHb56Lnk/tinas-dashboard?from=now-2d&to=now&viewPanel=22
Updated by livdywan almost 2 years ago
Next 3 steps before putting this in Feedback:
- Reintroduce the closed query to the QE Tools status
- Reduce frequency in Grafana to daily
- Save the new dashboard JSON in Salt
Updated by livdywan almost 2 years ago
cdywan wrote:
Next 3 steps before putting this in Feedback:
- Reintroduce the closed query to the QE Tools status
https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/35
Updated by livdywan almost 2 years ago
cdywan wrote:
Next 3 steps before putting this in Feedback:
- Reintroduce the closed query to the QE Tools status
- Reduce frequency in Grafana to daily
https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/810
Updated by tinita almost 2 years ago
A question about the data points we are feeding into influxdb.
This is the current output I get from --output=influxdb:
slo,team="QE\ Tools",status="New",title="Overall\ Backlog" count=5
slo,team="QE\ Tools",status="Workable",title="Overall\ Backlog" count=17
slo,team="QE\ Tools",status="In\ Progress",title="Overall\ Backlog" count=6
slo,team="QE\ Tools",status="Blocked",title="Overall\ Backlog" count=51
slo,team="QE\ Tools",status="Feedback",title="Overall\ Backlog" count=17
slo,team="QE\ Tools",status="Workable",title="Workable\ Backlog" count=17
slo,team="QE\ Tools",status="In\ Progress",title="In\ Progress" count=6
slo,team="QE\ Tools",status="Feedback",title="In\ Feedback" count=20
So we are getting certain numbers multiple times, and even different numbers, e.g. for Feedback we get 17 and 20.
I looked into the code and it is not really clear to me now why we would go over all of the queries in queries.yaml and sum up numbers per ticket status from all of them.
Updated by livdywan almost 2 years ago
- Due date changed from 2023-03-15 to 2023-03-31
- Status changed from In Progress to Feedback
cdywan wrote:
cdywan wrote:
Next 3 steps before putting this in Feedback:
- Reintroduce the closed query to the QE Tools status
https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/35
This hasn't been merged yet because unfortunately we're again discussing the backlogger design in the downstream PR after having implemented it.
- Maybe we want to omit unrestricted queries after all.
- Maybe we want separate queries.yaml files.
- Specifying output in queries.yaml has also been suggested.
I looked into the code and it is not really clear to me now why we would go over all of the queries in queries.yaml and sum up numbers per ticket status from all of them
Ashley Average wants to monitor the number of tickets in Feedback in Grafana.
That's the user story. I don't know if it's efficient or free of bugs. The answer as to "Why" is that it's the easiest way and requires no configuration.
Updated by livdywan almost 2 years ago
cdywan wrote:
cdywan wrote:
Next 3 steps before putting this in Feedback:
- Reintroduce the closed query to the QE Tools status
- Reduce frequency in Grafana to daily
https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/810
The change has been deployed but it doesn't seem to be effective...
sudo grep interval /etc/telegraf/telegraf.d/slo.conf
interval = "1d"
Updated by okurz almost 2 years ago
- Copied to action #126113: [tools][metrics] Only show queries in backlogger output that are relevant for the according output mode added
Updated by livdywan almost 2 years ago
cdywan wrote:
cdywan wrote:
Next 3 steps before putting this in Feedback:
- Reintroduce the closed query to the QE Tools status
https://github.com/os-autoinst/qa-tools-backlog-assistant/pull/35
Merged.
Let's give it time and prepare the new panel next week.
Oli will file a ticket on the output configuration.
Updated by livdywan almost 2 years ago
error loading config file /etc/telegraf/telegraf.d/slo.conf: error parsing exec, line 1:{0 286}: error parsing duration: time: unknown unit "d" in duration "1d"
Apparently 1d is not a valid unit for the interval: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/813
Updated by tinita over 1 year ago
I was thinking about the interval and possibly duplicate entries (if the query would run multiple times per day) and found this:
https://docs.influxdata.com/influxdb/v1.8/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points
So in order not to lose data if one query fails (and we only make one query per day), we could add a unique timestamp for the data (e.g. use the time 00:00:00 of the day the query is made) to the output and then run the query every 8 hours, for example. InfluxDB would still see it as one data point if they all have the same timestamp.
That strategy can also be used once we feed data for every individual ticket to influxdb.
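The deduplication idea above can be sketched as follows (midnight_ns is a hypothetical helper, not backlogger code; it pins each day's data point to 00:00:00 UTC of that day in nanoseconds, so repeated runs on the same day overwrite rather than duplicate the point):

```python
from datetime import datetime, timezone

# Hypothetical helper: every run on a given day emits the same
# timestamp (00:00:00 UTC of that day, in nanoseconds), so InfluxDB
# treats repeated runs as the same point and overwrites it.

def midnight_ns(now=None):
    now = now or datetime.now(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return int(midnight.timestamp()) * 10**9

ts = midnight_ns(datetime(2023, 3, 27, 14, 30, tzinfo=timezone.utc))
line = f"slo,team=QE\\ Tools count=12 {ts}"
```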
Updated by tinita over 1 year ago
https://github.com/openSUSE/backlogger/pull/21 Add timestamp in nanoseconds for influxdb output
Updated by tinita over 1 year ago
- Description updated (diff)
We talked about also adding older data to influxdb; this should be easy as well by using a suitable query. But for this we need to deliver the correct timestamp instead of just "today" as in my pull request.
Updated by tinita over 1 year ago
It seems this morning the telegraf query ran into a timeout, so it would be good to run the query more often again (as soon as we have merged the PR with the timestamp).
Mar 28 02:00:10 openqa-monitor telegraf[1315]: 2023-03-28T00:00:10Z E! [inputs.exec] Error in plugin: exec: command timed out for command '/etc/telegraf/scripts/backlogger/backlogger.py --output=influxdb /etc/telegraf/scripts/tools-backlog/queries.yaml':
It would also be good to have an option to feed data into influxdb for a certain past timeframe, like I mentioned above. Maybe this should be a script of its own rather than putting all that functionality into the backlogger, because we don't need all of this for the backlog status page.
Updated by livdywan over 1 year ago
tinita wrote:
It would also be good to have an option to feed data into influxdb for a certain past timeframe, like I mentioned above. Maybe this should be a script of its own rather than putting all that functionality into the backlogger, because we don't need all of this for the backlog status page.
Yes wrt refactoring into classes or separate projects. However, please file a new ticket since imho this ticket is getting too big if we're including historical data and covering Redmine outages.
Updated by livdywan over 1 year ago
Discussed it briefly. We'll consider this resolved once 1) Tina's PR is merged 2) the frequency is hourly again 3) JSON for Tina's dashboard is in salt.
Updated by okurz over 1 year ago
What we discussed some days ago, and what I see as missing from https://stats.openqa-monitor.qa.suse.de/d/ck8uu5f4z/agile?orgId=1&refresh=30m:
- median values
- units, like days/hours
Updated by tinita over 1 year ago
We decided that, strictly speaking, the AC will be fulfilled with the points @cdywan mentioned, and we can see some numbers for now; we wanted to split improvements into a new ticket.
My points would be:
- Feed individual ticket data to influxdb in order to calculate median
- Improve the script to be able to print data for a certain day or timeframe (maybe split it into its own script to avoid adding stuff that backlogger itself does not really need)
Should I create a new ticket?
Updated by okurz over 1 year ago
tinita wrote:
We decided that, strictly speaking, the AC will be fulfilled with the points @cdywan mentioned, and we can see some numbers for now; we wanted to split improvements into a new ticket.
My points would be:
- Feed individual ticket data to influxdb in order to calculate median
- Improve the script to be able to print data for a certain day or timeframe (maybe split it into its own script to avoid adding stuff that backlogger itself does not really need)
Should I create a new ticket?
Well, either the suggested points are handled in this ticket or another one. I don't mind either way. Your choice
Updated by tinita over 1 year ago
https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/823 Decrease interval for backlogger query
Updated by tinita over 1 year ago
https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/823 merged, I restarted telegraf
Updated by livdywan over 1 year ago
cdywan wrote:
- Save the new dashboard JSON in Salt
https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/824
Updated by tinita over 1 year ago
- Copied to action #127025: [tools][metrics] Improve cycle + lead times in Grafana added
Updated by livdywan over 1 year ago
We have a working dashboard with metrics in it, so I consider the ACs fulfilled. See #127025 for the suggested follow-up.