coordination #102951
coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes
[epic] Better network performance monitoring
Description
Motivation
See #102882
Acceptance criteria
- AC1: The up-/download rates are available to admins for investigation
Suggestions
- Record the up-/download rates from cache service downloads and maybe put them into the influxdb API routes or a log message
- OR: Can we run the corresponding iperf3 commands periodically in our monitoring? I guess just a few seconds every hour should provide enough data, and we can smooth the results in Grafana. There is an open request with a simple exec example that should work for our use case: https://github.com/influxdata/telegraf/issues/3866#issuecomment-694429507
- We just need to make sure not to run the requests against all workers at the same time, because that could quite easily saturate the whole link of OSD
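As a rough illustration of the iperf3 suggestion, here is a minimal sketch, not the implemented solution. It assumes an iperf3 server is reachable under a hostname passed as the first argument, and that the script is started once per hour by something like a telegraf exec input or a systemd timer. The helper names (`stagger_delay`, `to_line_protocol`, `measure`), the measurement name `iperf3` and the field names are hypothetical. The per-host delay is one simple way to keep workers from testing simultaneously; it must stay below whatever timeout the caller enforces.

```python
import hashlib
import json
import socket
import subprocess
import sys
import time


def stagger_delay(hostname: str, window_s: int = 300) -> int:
    """Deterministic per-host offset in [0, window_s) seconds.

    Spreads the tests out so that not all workers hit the link at once.
    """
    digest = hashlib.sha256(hostname.encode()).hexdigest()
    return int(digest, 16) % window_s


def to_line_protocol(result: dict, worker: str, server: str) -> str:
    """Turn parsed `iperf3 -J` output (TCP test) into one InfluxDB line."""
    end = result["end"]
    sent = float(end["sum_sent"]["bits_per_second"])
    received = float(end["sum_received"]["bits_per_second"])
    # Measurement and field names are assumptions chosen for this sketch.
    return (
        f"iperf3,worker={worker},server={server} "
        f"sent_bps={sent},received_bps={received}"
    )


def measure(server: str, seconds: int = 5) -> dict:
    """Run a short iperf3 client test and return its JSON report."""
    completed = subprocess.run(
        ["iperf3", "-c", server, "-t", str(seconds), "-J"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(completed.stdout)


if __name__ == "__main__" and len(sys.argv) > 1:
    target = sys.argv[1]
    host = socket.gethostname()
    time.sleep(stagger_delay(host))
    print(to_line_protocol(measure(target), host, target))
```

The script prints a single line in InfluxDB line protocol, which a collector could ingest as-is; smoothing over the hourly samples would then happen in Grafana as suggested above.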
History
#1
Updated by okurz over 1 year ago
- Copied from coordination #102882: [epic] All OSD PPC64LE workers except malbec appear to have horribly broken cache service added
#2
Updated by okurz over 1 year ago
- Project changed from openQA Infrastructure to openQA Project
- Category set to Feature requests
#3
Updated by okurz over 1 year ago
- Tracker changed from action to coordination
- Subject changed from Better network performance monitoring to [epic] Better network performance monitoring
- Status changed from New to Blocked
- Assignee set to okurz
- Parent task set to #80142
#4
Updated by okurz over 1 year ago
- Status changed from Blocked to Workable
- Assignee deleted (okurz)
One subtask resolved. Based on that we can come up with more subtasks.
#5
Updated by cdywan over 1 year ago
okurz wrote:
One subtask resolved. Based on that we can come up with more subtasks.
Let's talk about it in the Unblock tomorrow
#6
Updated by okurz over 1 year ago
- Status changed from Workable to Blocked
- Assignee set to okurz
#7
Updated by okurz about 1 year ago
- Status changed from Blocked to Resolved
All subtasks completed, AC1 fulfilled.
#8
Updated by okurz about 1 year ago
- Related to action #110497: Minion influxdb data causing unusual download rates size:M added