action #41885

closed

action #41867: [devops][tools] Replace get-metrics script by telegraf

[devops][functional][u] Collectd plugin to report data from workers

Added by szarate about 6 years ago. Updated about 6 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
SUSE QA (private) - Milestone 20
Start date:
2018-10-02
Due date:
% Done:

0%

Estimated time:

Description

To phase out get-metrics, the status of the systemd worker instances needs to be reported, along with the last job that each worker ran and the number of configured worker instances.


Files

gauge-systemd_service_6.rrd (145 KB) Example file szarate, 2018-10-25 08:35
Actions #1

Updated by szarate about 6 years ago

picking this task right away

Actions #2

Updated by szarate about 6 years ago

  • Status changed from New to In Progress
Actions #3

Updated by coolo about 6 years ago

  • Project changed from openQA Project (public) to openQA Infrastructure (public)
Actions #4

Updated by okurz about 6 years ago

  • Subject changed from [devops] Collectd plugin to report data from workers to [devops][functional][u] Collectd plugin to report data from workers
  • Target version changed from Current Sprint to Milestone 20

szarate joined qsf-u

Actions #5

Updated by szarate about 6 years ago

So after talking to Nick yesterday, it seems that we're dropping collectd. However, I already had something, so I polished it "a bit" and created a repo in my personal GitHub account. The exercise was fun though.

Currently it only reports the output of systemctl is-active openqa-worker@$instance; the number of worker instances is configurable via the collectd plugin configuration. What is missing:

  • Reporting last job that the worker ran
  • Check whether the openqa-worker service is actually running (Would need something like a worker heartbeat, or perhaps some other fancy thing that allows querying the information)
  • Register proper data types that better match what systemd reports at a first level, and also report information and statistics from jobs.
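For illustration only, the per-instance check described above can be sketched in Python (the actual plugin is written in Perl; this is not its code). The function names `is_active_exit_code` and `poll_workers` are hypothetical, and numbering the instances from 1 is an assumption. The probe is injectable so the loop can be exercised on a machine without openQA workers.

```python
import subprocess
from typing import Callable, Dict


def is_active_exit_code(unit: str) -> int:
    """Run `systemctl is-active --quiet <unit>` and return its exit code."""
    result = subprocess.run(
        ["systemctl", "is-active", "--quiet", unit], check=False
    )
    return result.returncode


def poll_workers(
    worker_instances: int,
    probe: Callable[[str], int] = is_active_exit_code,
) -> Dict[str, int]:
    """Map a type_instance name to the probe result for each worker instance.

    Assumption: instances are numbered 1..worker_instances.
    """
    results = {}
    for instance in range(1, worker_instances + 1):
        unit = f"openqa-worker@{instance}"
        results[f"systemd_service_{instance}"] = probe(unit)
    return results
```

Calling `poll_workers(10)` would mirror the `worker_instances 10` setting; passing a fake `probe` avoids touching systemd at all.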

The plugin is here: https://github.com/foursixnine/Collectd-Plugins-openQA

While the documentation should be enough, I'm leaving here what I used to set it up:

<Plugin perl>
        IncludeDir "/home/foursixnine/Projects/foursixnine.io/openqa-collectd/lib"
        BaseName "Collectd::Plugins"
        LoadPlugin openQA

        <Plugin openQA>
                worker_instances 10
        </Plugin>
</Plugin>

The output looks like this:

[2018-10-25 01:51:49] Dispatching: systemctl is-active --quiet openqa-worker@7
 {
  plugin => "openQA-worker",
  type => "gauge",
  type_instance => "systemd_service_7",
  values => [3],
}
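To read such a value, here is a small Python sketch under the assumption that the gauge carries the raw exit code of `systemctl is-active --quiet` (the output above suggests this, but it is not confirmed here). The meanings come from the exit status table in systemctl(1); `describe` and `SYSTEMCTL_STATUS` are hypothetical names.

```python
# Exit codes of `systemctl is-active`, per the systemctl(1) exit status table.
SYSTEMCTL_STATUS = {
    0: "unit is active",
    3: "unit is not active",
    4: "no such unit",
}


def describe(value_list: dict) -> str:
    """Turn a dispatched value list into a one-line human-readable summary.

    Assumption: values[0] is the raw systemctl exit code.
    """
    code = int(value_list["values"][0])
    meaning = SYSTEMCTL_STATUS.get(code, "unexpected exit code")
    return f'{value_list["type_instance"]}: {code} ({meaning})'


sample = {
    "plugin": "openQA-worker",
    "type": "gauge",
    "type_instance": "systemd_service_7",
    "values": [3],
}
print(describe(sample))
```

Under that assumption, the value 3 for systemd_service_7 would mean the openqa-worker@7 unit was not active when it was polled.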

I'm also attaching an example file.

/var/lib/collectd/rrd/phobos.suse.de/openQA-worker/gauge-systemd_service_6.rrd
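The path follows collectd's usual naming for RRD files: host, then plugin (with an optional plugin instance), then type and type instance. A minimal Python sketch of that scheme, with the hypothetical helper `rrd_path` and the data directory taken from the path above as the default:

```python
from pathlib import PurePosixPath


def rrd_path(
    host: str,
    plugin: str,
    type_: str,
    type_instance: str = "",
    plugin_instance: str = "",
    data_dir: str = "/var/lib/collectd/rrd",
) -> str:
    """Build <data_dir>/<host>/<plugin>[-<plugin_instance>]/<type>[-<type_instance>].rrd"""
    plugin_dir = f"{plugin}-{plugin_instance}" if plugin_instance else plugin
    file_stem = f"{type_}-{type_instance}" if type_instance else type_
    return str(PurePosixPath(data_dir) / host / plugin_dir / f"{file_stem}.rrd")


print(rrd_path("phobos.suse.de", "openQA-worker", "gauge", "systemd_service_6"))
```

With ten configured instances, one such file per `systemd_service_N` type instance would be expected under the same directory.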
