action #168145

open

coordination #161414: [epic] Improved salt based infrastructure management

implement telegraf health check and adjust according pipelines

Added by nicksinger about 1 month ago. Updated about 1 month ago.

Status: New
Priority: Low
Assignee: -
Category: Feature requests
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:

Description

Motivation

In #167051 we discovered that our testing of telegraf is not optimal, and @nicksinger stumbled over the health output plugin: https://github.com/influxdata/telegraf/tree/master/plugins/outputs/health#health-output-plugin. It could replace the current pipeline approach of logging into the monitoring host and executing telegraf --test. Instead, the health check could be configured to monitor only the relevant plugins (e.g. only inputs but no custom scripts) and be polled on demand (e.g. in pipelines created by MRs) to report instantly on the health of the service and the gathered metrics.
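
To make the "polled on demand" idea concrete, here is a minimal sketch of what a pipeline job could run instead of telegraf --test. It assumes the health output answers with HTTP 200 while all configured checks pass and with 503 otherwise, which is how I read the plugin README. HEALTH_URL, fetch_status and exit_code_for are hypothetical names for illustration, and the address has to match whatever service_address ends up in our telegraf config.

```python
import urllib.error
import urllib.request

# Assumption: address of the telegraf health output; hypothetical value,
# must match the service_address configured for the plugin.
HEALTH_URL = "http://localhost:8080"


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of the health endpoint, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Non-2xx answers (e.g. 503) are raised as HTTPError but still carry a code.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout: treat as "no answer".
        return 0


def exit_code_for(status: int) -> int:
    """Map the HTTP status to a pipeline exit code: 0 for healthy, 1 otherwise."""
    return 0 if status == 200 else 1
```

A job could then end with `sys.exit(exit_code_for(fetch_status(HEALTH_URL)))` so that the pipeline fails whenever the endpoint is unhealthy or unreachable. Treating "unreachable" as a failure is a design choice in this sketch, not something the plugin prescribes.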


Related issues 2 (1 open, 1 closed)

Copied from openQA Infrastructure - action #167051: https://gitlab.suse.de/openqa/salt-pillars-openqa/-/jobs/3109145 failed due to telegraf errors on monitor.qa.suse.de size:S (Status: Resolved, Assignee: nicksinger, 2024-09-19)

Copied to openQA Infrastructure - action #168148: hackweek idea: use loki to monitor our log files and explore alerting possibilites based on these size:S (Status: Workable, Assignee: nicksinger)

#1

Updated by nicksinger about 1 month ago

  • Copied from action #167051: https://gitlab.suse.de/openqa/salt-pillars-openqa/-/jobs/3109145 failed due to telegraf errors on monitor.qa.suse.de size:S added
#2

Updated by nicksinger about 1 month ago

  • Subject changed from implement telegraf health check and adjust according pipelines size:S to implement telegraf health check and adjust according pipelines
#3

Updated by nicksinger about 1 month ago

  • Copied to action #168148: hackweek idea: use loki to monitor our log files and explore alerting possibilites based on these size:S added
#4

Updated by okurz about 1 month ago

  • Target version set to future