action #41885
closed action #41867: [devops][tools] Replace get-metrics script by telegraf
[devops][functional][u] Collectd plugin to report data from workers
Description
To phase out get-metrics, the status of the systemd worker instances needs to be reported, along with the last job that each worker ran and the number of configured worker instances.
Updated by coolo about 6 years ago
- Project changed from openQA Project (public) to openQA Infrastructure (public)
Updated by okurz about 6 years ago
- Subject changed from [devops] Collectd plugin to report data from workers to [devops][functional][u] Collectd plugin to report data from workers
- Target version changed from Current Sprint to Milestone 20
szarate joined qsf-u
Updated by szarate about 6 years ago
- File gauge-systemd_service_6.rrd added
- Status changed from In Progress to Resolved
So after talking to Nick yesterday, it seems that we're dropping collectd. However, I already had something, so I polished it "a bit" and created a repo in my personal GitHub account. The exercise was fun, though.
Currently it only reports the exit status of systemctl is-active --quiet openqa-worker@$instance. The number of worker instances is configurable via the collectd plugin configuration. What is missing:
- Reporting last job that the worker ran
- Checking whether the openqa-worker service is actually running (this would need something like a worker heartbeat, or perhaps some other fancy thing that allows querying the information)
- Registering proper data types that better match what systemd reports at the first level, and that can also carry information and statistics from jobs.
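As a rough illustration of the current behaviour (this is not the plugin's Perl code, just a Python sketch of the same idea), the loop below probes each worker instance and builds one gauge record per instance. The function names, the injectable `probe` parameter, and the assumption that instances are numbered from 1 to `worker_instances` are all hypothetical.

```python
import subprocess
from typing import Callable, Dict, List


def is_active_exit_code(instance: int) -> int:
    """Return the exit status of `systemctl is-active --quiet` for one worker unit."""
    cmd = ["systemctl", "is-active", "--quiet", f"openqa-worker@{instance}"]
    # check=False: a non-zero exit status is a normal result here, not an error.
    return subprocess.run(cmd, check=False).returncode


def collect_worker_states(
    worker_instances: int,
    probe: Callable[[int], int] = is_active_exit_code,
) -> List[Dict[str, object]]:
    """Build one gauge record per worker instance.

    Assumption: instances are numbered 1..worker_instances.
    """
    records = []
    for instance in range(1, worker_instances + 1):
        records.append(
            {
                "plugin": "openQA-worker",
                "type": "gauge",
                "type_instance": f"systemd_service_{instance}",
                "values": [probe(instance)],
            }
        )
    return records
```

Passing a fake `probe` makes the loop easy to exercise on a machine without systemd, for example `collect_worker_states(3, probe=lambda i: 0)`.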
The plugin is here: https://github.com/foursixnine/Collectd-Plugins-openQA
While the documentation should be enough, I'm leaving here what I used to set it up:
<Plugin perl>
  IncludeDir "/home/foursixnine/Projects/foursixnine.io/openqa-collectd/lib"
  BaseName "Collectd::Plugins"
  LoadPlugin openQA
  <Plugin openQA>
    worker_instances 10
  </Plugin>
</Plugin>
The output looks like this:
[2018-10-25 01:51:49] Dispatching: systemctl is-active --quiet openqa-worker@7
{
plugin => "openQA-worker",
type => "gauge",
type_instance => "systemd_service_7",
values => [3],
}
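The reported value is the raw exit status of the systemctl call, so a reader of the graphs has to translate it back. A minimal sketch of that translation follows, relying only on the documented behaviour that `systemctl is-active` exits with 0 when the unit is active and non-zero otherwise; the helper name `describe_state` is hypothetical. The value 3 in the sample above would therefore mean that openqa-worker@7 was not active at that moment.

```python
def describe_state(exit_code: int) -> str:
    """Translate a `systemctl is-active` exit status into a readable label."""
    if exit_code == 0:
        return "active"
    # Any non-zero status means the unit is not active; keep the code for debugging.
    return f"not active (exit code {exit_code})"


if __name__ == "__main__":
    for code in (0, 3):
        print(code, "->", describe_state(code))
```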
I'm also attaching an example file.
/var/lib/collectd/rrd/phobos.suse.de/openQA-worker/gauge-systemd_service_6.rrd
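The location of that file follows the usual layout of collectd's rrdtool output: data directory, host, plugin, then type and type instance joined by a dash. The sketch below rebuilds the path from the record fields; the helper name `rrd_path` is hypothetical and a plugin instance is deliberately not handled, since the record above does not set one.

```python
from pathlib import PurePosixPath


def rrd_path(data_dir: str, host: str, plugin: str,
             type_: str, type_instance: str = "") -> PurePosixPath:
    """Build the RRD file path for one value list (no plugin instance)."""
    name = f"{type_}-{type_instance}" if type_instance else type_
    return PurePosixPath(data_dir) / host / plugin / f"{name}.rrd"


if __name__ == "__main__":
    print(rrd_path("/var/lib/collectd/rrd", "phobos.suse.de",
                   "openQA-worker", "gauge", "systemd_service_6"))
```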