coordination #102951
coordination #80142 (closed): [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes
[epic] Better network performance monitoring
Description
Motivation
See #102882
Acceptance criteria
- AC1: The up-/download rates are available to admins for investigation
Suggestions
- Record the up-/download rates from cache service downloads and maybe put them into the influxdb API routes or a log message
- OR: Can we run the corresponding iperf3 commands periodically in our monitoring? I guess just a few seconds every hour should provide enough data, and we can smooth in grafana. There is this open request with a simple exec example: https://github.com/influxdata/telegraf/issues/3866#issuecomment-694429507 - this should work for our use case. We just need to make sure not to run all requests to all workers at the same time, because that would quite easily saturate the whole link of OSD
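The periodic iperf3 idea could look roughly like the sketch below. It is not what was eventually implemented, just an illustration under assumptions: an `iperf3` server is reachable on each target, the JSON report of a TCP test carries the rate under `end.sum_received.bits_per_second`, and the output is consumed by something like a telegraf exec input that reads InfluxDB line protocol. The measurement name `iperf3`, the field names and all function names are made up for this example. The per-host offset spreads the runs over the hour so that not all workers are measured at the same time.

```python
import hashlib
import json
import subprocess


def stagger_offset(host: str, window_s: int = 3600, slot_s: int = 10) -> int:
    """Map a host name to a stable start offset (in seconds) within the window.

    A hash of the host name picks one of window_s // slot_s slots, so the
    measurements are spread out instead of all starting at once. Two hosts
    can still land in the same slot; this only reduces the overlap.
    """
    slots = window_s // slot_s
    digest = hashlib.sha256(host.encode()).hexdigest()
    return (int(digest, 16) % slots) * slot_s


def run_iperf3(host: str, seconds: int = 5, reverse: bool = False) -> dict:
    """Run one short iperf3 client test and return the parsed JSON report.

    Without -R the client sends to the server (upload); with -R the
    server sends to the client (download).
    """
    cmd = ["iperf3", "-c", host, "-t", str(seconds), "-J"]
    if reverse:
        cmd.append("-R")
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return json.loads(result.stdout)


def throughput_bps(report: dict) -> float:
    """Extract the receiver-side rate in bits per second from a report."""
    return float(report["end"]["sum_received"]["bits_per_second"])


def to_line_protocol(host: str, upload_bps: float, download_bps: float) -> str:
    """Format both rates as one InfluxDB line protocol record.

    Assumes the host name contains no spaces or commas, which would
    need escaping in a tag value.
    """
    return f"iperf3,target={host} upload_bps={upload_bps},download_bps={download_bps}"
```

A wrapper script run once per hour for each worker could sleep for `stagger_offset(host)` seconds, call `run_iperf3(host)` and `run_iperf3(host, reverse=True)`, and print the result of `to_line_protocol(...)` for the collector to pick up.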
Updated by okurz almost 3 years ago
- Copied from coordination #102882: [epic] All OSD PPC64LE workers except malbec appear to have horribly broken cache service added
Updated by okurz almost 3 years ago
- Project changed from openQA Infrastructure to openQA Project
- Category set to Feature requests
Updated by okurz almost 3 years ago
- Tracker changed from action to coordination
- Subject changed from Better network performance monitoring to [epic] Better network performance monitoring
- Status changed from New to Blocked
- Assignee set to okurz
- Parent task set to #80142
Updated by okurz almost 3 years ago
- Status changed from Blocked to Workable
- Assignee deleted (okurz)
One subtask resolved. Based on that we can come up with more subtasks.
Updated by livdywan over 2 years ago
okurz wrote:
One subtask resolved. Based on that we can come up with more subtasks.
Let's talk about it in the Unblock tomorrow
Updated by okurz over 2 years ago
- Status changed from Workable to Blocked
- Assignee set to okurz
Updated by okurz over 2 years ago
- Status changed from Blocked to Resolved
All subtasks completed, AC1 fulfilled.
Updated by okurz over 2 years ago
- Related to action #110497: Minion influxdb data causing unusual download rates size:M added