coordination #102951

coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes

[epic] Better network performance monitoring

Added by okurz about 2 months ago. Updated about 1 month ago.

Status: Workable
Priority: Normal
Assignee: -
Category: Feature requests
Target version:
Start date: 2021-11-24
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Difficulty:

Description

Motivation

See #102882

Acceptance criteria

  • AC1: The up-/download rates are available to admins for investigation

Suggestions

  • Record the up-/download rates from cache service downloads and maybe expose them via the influxdb API routes or a log message, OR
  • Can we run the corresponding iperf3 commands periodically in our monitoring? I guess just a few seconds every hour should provide enough data, and we can smooth it in grafana. There is this open request with a simple exec example: https://github.com/influxdata/telegraf/issues/3866#issuecomment-694429507 - this should work for our use case. We just need to make sure not to run the requests to all workers at the same time, because that could quite easily saturate the whole link of OSD.
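The periodic iperf3 idea could look roughly like the following. This is only a minimal sketch, not something taken from the linked telegraf comment: it assumes iperf3 is installed on the worker, that an iperf3 server is reachable on the target host, and that the output is consumed as InfluxDB line protocol (for example by a telegraf exec input with data_format = "influx"). The measurement name iperf3, the field names sent_bps and received_bps, and the helper names are made up for illustration. To avoid all workers measuring at once, each worker derives a fixed delay from its own hostname.

```python
#!/usr/bin/env python3
"""Sketch: run a short iperf3 test and print the rates as InfluxDB line protocol."""
import hashlib
import json
import socket
import subprocess
import sys
import time


def stagger_offset(hostname: str, window_s: int) -> int:
    """Return a stable per-host delay in [0, window_s) so workers do not all start together."""
    # hashlib instead of hash(): the built-in string hash is randomized per process.
    digest = hashlib.sha256(hostname.encode("utf-8")).hexdigest()
    return int(digest, 16) % window_s


def to_line_protocol(host: str, server: str, result: dict) -> str:
    """Format the summary of an iperf3 JSON result (TCP test) as one line-protocol record."""
    # Assumption: host and server contain no spaces, commas or '=' (no escaping is done).
    end = result["end"]
    sent = float(end["sum_sent"]["bits_per_second"])
    received = float(end["sum_received"]["bits_per_second"])
    return f"iperf3,host={host},server={server} sent_bps={sent},received_bps={received}"


def run_iperf3(server: str, seconds: int = 5) -> dict:
    """Run the iperf3 client for a few seconds and return its parsed JSON report."""
    proc = subprocess.run(
        ["iperf3", "-c", server, "-J", "-t", str(seconds)],
        capture_output=True,
        text=True,
        check=True,
        timeout=seconds + 30,
    )
    return json.loads(proc.stdout)


def main(argv: list) -> int:
    if len(argv) < 2:
        print(f"usage: {argv[0]} SERVER [STAGGER_WINDOW_SECONDS]", file=sys.stderr)
        return 2
    server = argv[1]
    window_s = int(argv[2]) if len(argv) > 2 else 0
    host = socket.gethostname()
    if window_s > 0:
        # Spread the workers over the window instead of hitting the link simultaneously.
        time.sleep(stagger_offset(host, window_s))
    print(to_line_protocol(host, server, run_iperf3(server)))
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

If this were called from a telegraf exec input, the staggering delay would have to fit within that input's timeout setting, or the staggering would have to happen elsewhere (for example in a timer that starts the script). A plain iperf3 client run measures the upload direction from the worker; adding -R to the command would measure the reverse direction.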


Subtasks

action #102957: Better network performance monitoring - up-/download speed from cache service, e.g. in log file size:M | Resolved | kraih


Related issues

Copied from openQA Infrastructure - action #102882: All OSD PPC64LE workers except malbec appear to have horribly broken cache service | Feedback | 2021-11-23 | 2022-01-28

History

#1 Updated by okurz about 2 months ago

  • Copied from action #102882: All OSD PPC64LE workers except malbec appear to have horribly broken cache service added

#2 Updated by okurz about 2 months ago

  • Project changed from openQA Infrastructure to openQA Project
  • Category set to Feature requests

#3 Updated by okurz about 2 months ago

  • Tracker changed from action to coordination
  • Subject changed from Better network performance monitoring to [epic] Better network performance monitoring
  • Status changed from New to Blocked
  • Assignee set to okurz
  • Parent task set to #80142

#4 Updated by okurz about 1 month ago

  • Status changed from Blocked to Workable
  • Assignee deleted (okurz)

One subtask resolved. Based on that we can come up with more subtasks.
