action #138287

closed

petrol sometimes takes a long time to respond/render http://localhost:9530/influxdb/minion

Added by nicksinger 7 months ago. Updated 6 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Feature requests
Target version:
Start date:
Due date:
% Done:

0%

Estimated time:

Description

Observation

Sometimes pipelines (e.g. https://gitlab.suse.de/openqa/salt-states-openqa/-/jobs/1915033) fail with:

2023-10-19T13:14:13Z E! [inputs.http] Error in plugin: [url=http://localhost:9530/influxdb/minion]: Get "http://localhost:9530/influxdb/minion": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

It seems like the endpoint on that host sometimes takes a long time to respond:

petrol:~ # time curl http://localhost:9530/influxdb/minion
openqa_minion_jobs,url=http://localhost:9530 active=0i,delayed=0i,failed=19i,inactive=0i
openqa_minion_workers,url=http://localhost:9530 active=0i,inactive=1i,registered=1i
openqa_download_count,url=http://localhost:9530 count=0i
openqa_download_rate,url=http://localhost:9530 bytes=28359186i

real    0m0.008s
user    0m0.006s
sys     0m0.000s
petrol:~ # time curl http://localhost:9530/influxdb/minion
openqa_minion_jobs,url=http://localhost:9530 active=0i,delayed=0i,failed=19i,inactive=0i
openqa_minion_workers,url=http://localhost:9530 active=0i,inactive=1i,registered=1i
openqa_download_count,url=http://localhost:9530 count=0i
openqa_download_rate,url=http://localhost:9530 bytes=28359186i

real    0m0.008s
user    0m0.006s
sys     0m0.000s
petrol:~ # time curl http://localhost:9530/influxdb/minion
openqa_minion_jobs,url=http://localhost:9530 active=0i,delayed=0i,failed=19i,inactive=1i
openqa_minion_workers,url=http://localhost:9530 active=0i,inactive=1i,registered=1i
openqa_download_count,url=http://localhost:9530 count=0i
openqa_download_rate,url=http://localhost:9530 bytes=28359186i

real    0m6.242s
user    0m0.003s
sys     0m0.003s
petrol:~ # time curl http://localhost:9530/influxdb/minion
openqa_minion_jobs,url=http://localhost:9530 active=1i,delayed=0i,failed=19i,inactive=0i
openqa_minion_workers,url=http://localhost:9530 active=1i,inactive=0i,registered=1i
openqa_download_count,url=http://localhost:9530 count=1i
openqa_download_rate,url=http://localhost:9530 bytes=28359186i

real    0m11.547s
user    0m0.006s
sys     0m0.000s

Reproducible

Not sure what causes the long response times, but I could easily reproduce them by running time curl http://localhost:9530/influxdb/minion a couple of times.
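To make the reproduction less manual, the repeated curl calls could be wrapped in a small loop that records each response time and flags the slow ones. This is only a sketch: the function names classify and probe, the 30-second cap, and the threshold passed in are hypothetical choices for illustration, not something taken from our tooling.

```shell
# classify SECONDS THRESHOLD: print "slow" if SECONDS exceeds THRESHOLD, else "ok".
# awk does the comparison because POSIX sh has no floating-point arithmetic.
classify() {
    awk -v t="$1" -v limit="$2" \
        'BEGIN { if (t + 0 > limit + 0) print "slow"; else print "ok" }'
}

# probe URL COUNT THRESHOLD: request URL COUNT times and print one line per
# request with the attempt number, the response time and the classification.
probe() {
    url=$1
    count=$2
    threshold=$3
    i=1
    while [ "$i" -le "$count" ]; do
        # %{time_total} is curl's total request time in seconds;
        # --max-time (assumed 30s here) keeps a hung request from blocking the loop.
        t=$(curl -s -o /dev/null --max-time 30 -w '%{time_total}' "$url")
        echo "$i $t $(classify "$t" "$threshold")"
        i=$((i + 1))
    done
}
```

On petrol this could be run as probe http://localhost:9530/influxdb/minion 20 5 to see how many of 20 requests take longer than 5 seconds.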

Expected result

The route should respond quickly. At the very least, if we cannot understand or fix the underlying problem, our pipelines should not fail because of it.

Suggestions

  • Understand why that API endpoint takes so long to respond on only that host
  • Bump the HTTP request timeout in our telegraf config
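
For the second suggestion, a raised timeout on the telegraf side might look roughly like the fragment below. This is a sketch and not our actual config: the 20s value is an assumption picked to sit above the 11.5 seconds observed above, and the rest of the stanza is assumed from the error message and the line-protocol output of the endpoint.

```toml
[[inputs.http]]
  # Endpoint taken from the error message in the failing pipeline.
  urls = ["http://localhost:9530/influxdb/minion"]
  # The route returns InfluxDB line protocol.
  data_format = "influx"
  # Assumed value, chosen to exceed the slowest response seen so far (~11.5s).
  timeout = "20s"
```

A higher timeout would only keep the pipeline from failing; it would not explain why the route is slow on this host.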