action #125765
Make Telegraf errors visible in alert handling
Start date: 2022-12-06
Due date:
% Done: 0%
Estimated time:
Description
Motivation
In the context of #121582, the deployed InfluxDB input did not seem to be picked up by Grafana, but we also saw no deployment issues or alerts that would explain why it was broken.
Acceptance criteria
- AC1: The team is aware of errors in Telegraf inputs
Suggestions
- Run
  sudo telegraf --test --config /etc/telegraf/telegraf.d/slo.conf
  with the corresponding config filename. By default only one config file will be used
- Use logwarn (cf. openqa logwarn)
- Use https://grafana.com/oss/loki/ (maybe overkill?)
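Since the manual check above covers only one config file per run, the first suggestion could be sketched as a loop over all files in the drop-in directory. This is a minimal sketch, not what was deployed: the function name check_telegraf_configs and the TELEGRAF_BIN variable are made up for illustration, and it assumes that telegraf --test exits with a non-zero status when a config cannot be parsed or an input reports errors. It may need to run as root, as in the sudo call above.

```shell
#!/bin/sh
# Sketch: run "telegraf --test" once per config file and report failures.
# check_telegraf_configs and TELEGRAF_BIN are hypothetical names for this example.
# Assumption: telegraf exits non-zero when a config or input is broken.

check_telegraf_configs() {
    conf_dir="${1:-/etc/telegraf/telegraf.d}"
    failed=0
    for conf in "$conf_dir"/*.conf; do
        # Skip the unexpanded glob when the directory has no .conf files.
        [ -e "$conf" ] || continue
        if ! "${TELEGRAF_BIN:-telegraf}" --test --config "$conf" >/dev/null 2>&1; then
            echo "telegraf --test failed for $conf" >&2
            failed=1
        fi
    done
    # 0 if every config passed, 1 if at least one failed.
    return "$failed"
}
```

Calling check_telegraf_configs /etc/telegraf/telegraf.d from a periodic job and alerting on a non-zero return would be one way to make broken inputs visible. Note that test mode actually gathers metrics from each input, so slow inputs make the check slow as well.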
Related issues
History
#1
Updated by cdywan 3 months ago
- Copied from action #121582: [tools][metrics] Calculate cycle + lead times for SUSE QE Tools continuously size:M added
#2
Updated by okurz 3 months ago
- Tags set to telegraf, salt
- Due date deleted (2023-03-15)
nicksinger mentioned https://github.com/influxdata/telegraf/tree/master/plugins/inputs/tail
My proposal: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/805
#3
Updated by cdywan 3 months ago
- Status changed from New to Feedback
- Assignee set to okurz
okurz wrote:
My proposal: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/805
The MR was reviewed and merged; let's see if this is sufficient. Hence putting the ticket in Feedback (if for some reason it's not, we can of course still reset the ticket and consider more elaborate options).