action #133700

open

openQA Project - coordination #155485: [saga][epic] Efficient openQA worker pool resource handling in datacenters

coordination #158374: [epic] Prevention of inefficient hardware resource use

Network bandwidth graphs per switch, like https://mrtg.suse.de/qanet13nue, for all current top-of-rack switches (TORs) that we are connected to size:M

Added by okurz 9 months ago. Updated 27 days ago.

Status: Blocked
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 2023-08-02
Due date:
% Done: 0%
Estimated time:

Description

Motivation

Network-related issues occur often enough. To find the available bandwidth or bottlenecks, graphs like https://mrtg.suse.de/qanet13nue/index.html can be quite helpful:

Screenshot_20230802_205949.png

We have those available for NUE1-based switches, but I do not know about NUE2 or PRG2. We should research how something equivalent is possible and ensure everybody within our team is able to reach the corresponding graphs.

Acceptance criteria

  • AC1: Graphs like those from mrtg.suse.de are available to all current SUSE QE Tools members for common racks, e.g. FC_Basement-B1..5, PRG2-J11+PRG2-J12
  • AC2: The team knows how to reach those

Suggestions


Files

Screenshot_20230802_205949.png (83.6 KB), okurz, 2023-08-02 19:00

Related issues 1 (0 open, 1 closed)

Related to openQA Project - action #138698: significant increase in multi-machine test failures on OSD since 2023-10-25, e.g. test fails in support_server/setup size:M (Resolved, mkittler, 2023-10-27)

Actions #1

Updated by mkittler 9 months ago

  • Subject changed from "Network bandwidth graphs per switch, like https://mrtg.suse.de/qanet13nue/index.html, for all current top-of-rack switches (TORs) that we are connected to" to "Network bandwidth graphs per switch, like https://mrtg.suse.de/qanet13nue, for all current top-of-rack switches (TORs) that we are connected to size:M"
  • Description updated (diff)
  • Status changed from New to Workable
Actions #2

Updated by okurz 8 months ago

  • Target version changed from Ready to Tools - Next
Actions #3

Updated by okurz 6 months ago

  • Target version changed from Tools - Next to Ready
Actions #6

Updated by okurz 6 months ago

  • Related to action #138698: significant increase in multi-machine test failures on OSD since 2023-10-25, e.g. test fails in support_server/setup size:M added
Actions #7

Updated by okurz 5 months ago

  • Assignee set to bschmidt

Please just ask in Slack channel #dct-migration https://app.slack.com/client/T02863RC2AC/C04MDKHQE20?cdn_fallback=2 with a back-reference to this ticket.

Actions #9

Updated by bschmidt 5 months ago

Just adding the slack conversation here:

Jiri Novak
for switches that will be in zabbix, it would make sense to do some dashboard there. but that is not finished yet

Oliver Kurz
I am sure the switches already have something built in. Can we have access to that?

Birger Schmidt
@Jiri Novak I'm not quite sure if I understand what you are saying.
Is the info already in zabbix and just the dashboard is missing, or is the data collection not yet done at all?
And if there is nothing in zabbix, is the temporary solution that Oli is pointing to possible in the meantime?

Jiri Novak
collection is not done yet
sorry, i've been busy the last days with a huge migration, so i didn't bother to check which switches we are talking about. if it's the enginfra managed ones, they're in a separate unrouted network, so no

Oliver Kurz
@Moroni Flores can you help to ensure this is planned accordingly?

Actions #10

Updated by bschmidt 5 months ago

  • Status changed from Workable to Blocked
Actions #11

Updated by bschmidt 5 months ago

Moroni Flores
this and all the monitoring tasks are planned for dec-feb

Oliver Kurz
That sounds great! Any card we can subscribe to or do I need to bug you again? :)

Moroni Flores
I don’t mind if you bug me again but here is the card https://jira.suse.com/browse/ENGINFRA-1893

Oliver Kurz
@Birger Schmidt
now we can set https://progress.opensuse.org/issues/133700 to "Blocked" referencing https://jira.suse.com/browse/ENGINFRA-1893

Actions #12

Updated by okurz 4 months ago

  • Assignee changed from bschmidt to okurz
  • Target version changed from Ready to Tools - Next
Actions #13

Updated by okurz 2 months ago

  • Target version changed from Tools - Next to future
Actions #14

Updated by okurz 27 days ago

Actions #15

Updated by okurz 27 days ago

  • Parent task set to #158374