openSUSE Project Management Tool, https://progress.opensuse.org/, 2018-09-04T14:42:38Z
openQA Project - action #40583: Provide job stats for telegraf to poll, https://progress.opensuse.org/issues/40583?journal_id=147098, 2018-09-04T14:42:38Z, coolo (coolo@suse.com)
<ul></ul><p>Yeah, we're interested in this as well - for a bit longer, and with a bit more information required.</p>
openQA Project - action #40583: Provide job stats for telegraf to poll, https://progress.opensuse.org/issues/40583?journal_id=147101, 2018-09-04T14:42:50Z, coolo (coolo@suse.com)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-4 priority-default closed parent" href="/issues/18164">action #18164</a>: [devops][tools] monitoring of openqa worker instances</i> added</li></ul>
openQA Project - action #40583: Provide job stats for telegraf to poll, https://progress.opensuse.org/issues/40583?journal_id=147302, 2018-09-05T21:17:10Z, jberry (jberry@suse.com)
<ul></ul><p>Likely the best route would be to expose the information via HTTP, where it can then be polled by telegraf on metrics.o.o. If standard system information like disk, CPU, and memory usage is also desired, there are a variety of pre-existing solutions for that, and the data can even end up in Grafana on metrics.o.o: either telegraf running on individual workers, or exposing the data via a munin agent or similar. I imagine your focus is the queue sizes and other metrics specific to openQA, which would need to be exposed directly.</p>
openQA Project - action #40583: Provide job stats for telegraf to poll, https://progress.opensuse.org/issues/40583?journal_id=149495, 2018-09-14T12:47:06Z, lnussel (lnussel@suse.com)
<ul></ul><p>For a start a very simple graph would be the number of running and scheduled jobs. Can be polled easily with</p>
<p>curl <a href="https://openqa.opensuse.org/api/v1/jobs?state=running,scheduled" class="external">https://openqa.opensuse.org/api/v1/jobs?state=running,scheduled</a></p>
openQA Project - action #40583: Provide job stats for telegraf to poll, https://progress.opensuse.org/issues/40583?journal_id=149528, 2018-09-14T13:03:21Z, coolo (coolo@suse.com)
<ul></ul><p>Well, 'easily' possible - fast, no:</p>
<pre><code>coolo@openqa:~> time openqa-client jobs state=running,scheduled > /dev/null
real 0m42.334s
user 0m7.516s
</code></pre>
openQA Project - action #40583: Provide job stats for telegraf to poll, https://progress.opensuse.org/issues/40583?journal_id=155711, 2018-10-11T17:35:56Z, coolo (coolo@suse.com)
<ul><li><strong>Subject</strong> changed from <i>visualize workload distribution</i> to <i>Provide job stats for telegraf to poll</i></li><li><strong>Target version</strong> set to <i>Current Sprint</i></li></ul><p>We need a rather fast JSON route to report the following information:</p>
<p>number of running jobs<br>
number of blocked jobs<br>
number of non-blocked scheduled jobs</p>
<p>total, per job group, per ARCH, per worker (host)</p>
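<p>A minimal sketch of the aggregation described above, assuming each job is available as a dict. The field names (<code>state</code>, <code>blocked</code>, <code>group</code>, <code>arch</code>, <code>host</code>) and the function name <code>job_stats</code> are hypothetical and not taken from openQA's schema or code; a fast route would more likely do this counting in the database.</p>

```python
from collections import Counter


def job_stats(jobs):
    """Count running, blocked, and non-blocked scheduled jobs.

    Each category gets a total plus per-group, per-arch, and per-host counts.
    The input field names are assumptions made for this illustration.
    """
    stats = {}
    for job in jobs:
        if job["state"] == "running":
            category = "running"
        elif job["state"] == "scheduled":
            # A scheduled job is either blocked or ready to be picked up.
            category = "blocked" if job.get("blocked") else "scheduled"
        else:
            continue  # finished or cancelled jobs are not part of the stats

        bucket = stats.setdefault(
            category,
            {"total": 0, "by_group": Counter(), "by_arch": Counter(), "by_host": Counter()},
        )
        bucket["total"] += 1
        bucket["by_group"][job["group"]] += 1
        bucket["by_arch"][job["arch"]] += 1
        # Only jobs already assigned to a worker have a host.
        if job.get("host"):
            bucket["by_host"][job["host"]] += 1
    return {"stats": stats}
```

<p>Because <code>Counter</code> is a <code>dict</code> subclass, the result can be passed straight to <code>json.dumps</code>.</p>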
openQA Project - action #40583: Provide job stats for telegraf to poll, https://progress.opensuse.org/issues/40583?journal_id=155714, 2018-10-11T17:39:22Z, coolo (coolo@suse.com)
<ul></ul><p><a href="https://github.com/influxdata/telegraf/tree/master/plugins/parsers/json" class="external">https://github.com/influxdata/telegraf/tree/master/plugins/parsers/json</a> looks related</p>
openQA Project - action #40583: Provide job stats for telegraf to poll, https://progress.opensuse.org/issues/40583?journal_id=156536, 2018-10-14T10:21:46Z, coolo (coolo@suse.com)
<ul><li><strong>Assignee</strong> set to <i>coolo</i></li></ul><p><a href="https://github.com/os-autoinst/openQA/pull/1829" class="external">https://github.com/os-autoinst/openQA/pull/1829</a> results in</p>
<pre><code>{
"stats" : {
"running" : {
"by_group" : {
"openSUSE Leap 42.3 Updates" : 2,
"openSUSE Krypton" : 3,
"openSUSE Argon" : 2,
"openSUSE Leap 15 AArch64" : 2,
"openSUSE Leap 15.0 Updates" : 1
},
"total" : 10,
"by_arch" : {
"aarch64" : 2,
"x86_64" : 8
},
"by_host" : {
"openqaworker1" : 3,
"openqa-aarch64" : 2,
"openqaworker4" : 3,
"imagetester" : 2
}
},
"scheduled" : {
"by_group" : {
"openSUSE Leap 15 AArch64" : 5
},
"total" : 5,
"by_arch" : {
"aarch64" : 5
}
},
"blocked" : {
"by_group" : {
"openSUSE Leap 15 AArch64" : 1
},
"total" : 1,
"by_arch" : {
"aarch64" : 1
}
}
}
}
</code></pre>
openQA Project - action #40583: Provide job stats for telegraf to poll, https://progress.opensuse.org/issues/40583?journal_id=157379, 2018-10-16T05:33:22Z, coolo (coolo@suse.com)
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Resolved</i></li></ul><p><a href="https://github.com/openSUSE/openSUSE-release-tools/pull/1731" class="external">https://github.com/openSUSE/openSUSE-release-tools/pull/1731</a></p>
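<p>For illustration only, the nested stats JSON shown earlier could be flattened into InfluxDB line protocol roughly as follows. This is a sketch and not what the linked pull request does; the measurement name <code>openqa_jobs</code> and the helper names are assumptions.</p>

```python
def escape_tag(value):
    """Escape commas, equals signs, and spaces, as line protocol requires for tag values."""
    for char in (",", "=", " "):
        value = value.replace(char, "\\" + char)
    return value


def to_line_protocol(stats, measurement="openqa_jobs"):
    """Turn the contents of the "stats" object into a list of line protocol strings."""
    lines = []
    for state, buckets in sorted(stats.items()):
        # One point per state carrying the total count.
        lines.append(f"{measurement},state={state} total={buckets['total']}i")
        # One point per group, arch, or host; a missing breakdown is skipped.
        for key in ("by_group", "by_arch", "by_host"):
            tag = key[len("by_"):]
            for name, count in sorted(buckets.get(key, {}).items()):
                lines.append(
                    f"{measurement}_{key},state={state},{tag}={escape_tag(name)} count={count}i"
                )
    return lines
```

<p>Group names such as "openSUSE Leap 15 AArch64" contain spaces, which is why the tag values are escaped.</p>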
openQA Project - action #40583: Provide job stats for telegraf to poll, https://progress.opensuse.org/issues/40583?journal_id=158324, 2018-10-17T21:34:40Z, jberry (jberry@suse.com)
<ul></ul><p>Deployed as <a href="https://metrics.opensuse.org/d/osrt_openqa/osrt-openqa" class="external">https://metrics.opensuse.org/d/osrt_openqa/osrt-openqa</a>.</p>
openQA Project - action #40583: Provide job stats for telegraf to poll, https://progress.opensuse.org/issues/40583?journal_id=169007, 2018-11-27T09:39:05Z, coolo (coolo@suse.com)
<ul><li><strong>Target version</strong> changed from <i>Current Sprint</i> to <i>Done</i></li></ul>