action #152095

closed

openQA Project - coordination #112862: [saga][epic] Future ideas for easy multi-machine handling: MM-tests as first-class citizens

openQA Project - coordination #111929: [epic] Stable multi-machine tests covering multiple physical workers

[spike solution][timeboxed:8h] Ping over GRE tunnels and TAP devices and openvswitch outside a VM with differing packet sizes size:S

Added by okurz 5 months ago. Updated 3 months ago.

Status: Resolved
Priority: Low
Assignee:
Category: -
Target version:
Start date: 2023-12-05
Due date:
% Done: 0%
Estimated time:

Description

Motivation

See lessons learned meeting #139136. We would again benefit from an easier reproducer. Related to #135818. Come up with a way to ping over GRE tunnels, TAP devices and openvswitch outside a VM with differing packet sizes.

Acceptance criteria

  • AC1: We know how to ping over GRE tunnels and TAP devices and openvswitch outside a VM with differing packet sizes

Suggestions

  • Research upstream about pinging over specific interfaces, GRE tunnels, TAP devices, openvswitch, etc.

    • Like ping -I<interface> or ping X.X.X.X%tap0?
    • Check out network namespaces and whether they could be used
  • Research about MTU size debugging, tracepath, traceroute, etc.

  • Experiment in an openQA-environment or openQA-like with the bridges, tap devices, etc.

  • Demonstrate to the team in written form or interactively

  • Look up how the existing check is done via a VM/VNC, and see how this could be simplified


Related issues 3 (1 open, 2 closed)

Related to openQA Project - action #152389: significant increase in MM-test failure ratio 2023-12-11: test fails in multipath_iscsi and other multi-machine scenarios due to MTU size auto_review:"ping with packet size 1350 failed, problems with MTU" size:M (Resolved, mkittler, 2023-12-11)

Copied from openQA Infrastructure - action #152092: Handle all package downgrades in OSD infrastructure properly in salt size:M (Resolved, nicksinger, 2023-12-05)

Copied to openQA Infrastructure - action #152098: [research][timeboxed:10h] Learn more about openvswitch with experimenting together size:S (Workable, 2023-12-05)

Actions #1

Updated by okurz 5 months ago

  • Copied from action #152092: Handle all package downgrades in OSD infrastructure properly in salt size:M added
Actions #2

Updated by okurz 5 months ago

  • Copied to action #152098: [research][timeboxed:10h] Learn more about openvswitch with experimenting together size:S added
Actions #3

Updated by okurz 5 months ago

  • Related to action #152389: significant increase in MM-test failure ratio 2023-12-11: test fails in multipath_iscsi and other multi-machine scenarios due to MTU size auto_review:"ping with packet size 1350 failed, problems with MTU" size:M added
Actions #4

Updated by okurz 3 months ago

  • Target version changed from Tools - Next to Ready
Actions #5

Updated by okurz 3 months ago

  • Priority changed from Normal to Low
Actions #6

Updated by okurz 3 months ago

  • Subject changed from [spike solution][timeboxed:10h] Ping over GRE tunnels and TAP devices and openvswitch outside a SUT with differing packet sizes to [spike solution][timeboxed:8h] Ping over GRE tunnels and TAP devices and openvswitch outside a VM with differing packet sizes size:S
  • Description updated (diff)
  • Status changed from New to Workable
Actions #7

Updated by jbaier_cz 3 months ago

  • Assignee set to jbaier_cz
Actions #8

Updated by jbaier_cz 3 months ago

  • Status changed from Workable to In Progress
Actions #9

Updated by jbaier_cz 3 months ago · Edited

I did some initial experiments with the tunnels and openvswitch. The setup should be similar to openQA, but has some (important) differences. I will address them in the next step.

Initial setup for all experiments

# Enable ip forwarding
sysctl -w net.ipv4.ip_forward=1
sysctl -w net.ipv6.conf.all.forwarding=1
# Install and enable openvswitch
zypper in openvswitch3
systemctl enable --now openvswitch

Experiments

Host  Network address  Bridge address   Remote IP
A     192.0.2.1/24     192.168.42.1/24  192.0.2.2
B     192.0.2.2/24     192.168.43.1/24  192.0.2.1

Note: instead of having two /24 networks, it is also possible to assign addresses from one bigger network (which has the benefit of not needing an explicit route assignment).
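
As a quick sanity check of the addressing, the `192.168.42.0/23` route used in the scenarios below should cover both bridge addresses from the table. The following is only a small sketch in POSIX shell arithmetic; the helper names `ip_to_int` and `in_network` are made up for this illustration and are not part of any tool.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies inside network $2 (CIDR notation).
in_network() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    prefix=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF ))
    [ $(( $addr & $mask )) -eq $(( $net & $mask )) ]
}

# Both bridge addresses should be covered, a neighbouring network should not.
for a in 192.168.42.1 192.168.43.1 192.168.44.1; do
    if in_network "$a" 192.168.42.0/23; then
        echo "$a is inside 192.168.42.0/23"
    else
        echo "$a is outside 192.168.42.0/23"
    fi
done
```

The first two addresses are reported as inside and the third as outside, which is why a single /23 route (or a single /23 network on both bridges) is enough for the two hosts.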

Simple scenario

Two servers with a single bridge on each side, connected with a GRE tunnel.

# Create bridge and tunnel ($bridge_address and $remote_ip are the per-host values from the table above)
nmcli con add type bridge con.int br0 bridge.stp yes ipv4.method manual ipv4.address "$bridge_address" ipv4.routes 192.168.42.0/23
nmcli con add type ip-tunnel mode gretap con.int gre1 master br0 remote "$remote_ip"

# Test the tunnel with ping
#   -M do   -- prohibit fragmentation
#   -s xxxx -- set packet size

ping -c 3 -M do -s 1300 192.168.42.1
ping -c 3 -M do -s 1300 192.168.43.1
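
To cover the "differing packet sizes" part, the single `-s 1300` probe can be turned into a search for the largest payload that still passes unfragmented. This is a rough sketch, not something that was run as part of the experiments: the function names `probe` and `max_payload` and the `PROBE` override variable are hypothetical, and the search assumes that every size below the limit works and every size above it fails.

```shell
# Probe one payload size ($2) against host $1; returns 0 if a single
# ping with fragmentation prohibited gets an answer within one second.
probe() {
    ping -c 1 -W 1 -M do -s "$2" "$1" >/dev/null 2>&1
}

# Binary search for the largest ICMP payload that reaches host $1.
#   $2 -- a size assumed to work (lower bound)
#   $3 -- a size assumed to fail (upper bound)
# Set PROBE to the name of another function to replace the real ping.
max_payload() {
    host=$1 lo=$2 hi=$3
    while [ $(( $hi - $lo )) -gt 1 ]; do
        mid=$(( ($lo + $hi) / 2 ))
        if "${PROBE:-probe}" "$host" "$mid"; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    echo "$lo"
}

# Example on host A: max_payload 192.168.43.1 0 9000
```

The result is only meaningful if the lower bound really works, so it is worth pinging once with a small payload first.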

Scenario with openvswitch

Two servers, each with one virtual (openvswitch) bridge, connected with a GRE tunnel.

# Create bridge, port and interface
nmcli con add type ovs-bridge con.int br0 ovs-bridge.stp-enable yes
nmcli con add type ovs-port con.int br0 con.master br0
nmcli con add type ovs-interface con.int br0 con.master br0 ipv4.method manual ipv4.address "$bridge_address" ipv4.routes 192.168.42.0/23

# Create GRE tunnel
nmcli con add type ovs-port con.int gre1 con.master br0
nmcli con add type ip-tunnel mode gretap con.int gre1 master gre1 remote "$remote_ip"

# Test the tunnel
ping -c 3 -M do -s 1300 192.168.42.1
ping -c 3 -M do -s 1300 192.168.43.1

#  ovs-vsctl show
de1f31e9-1b51-4cc3-954a-4e037191ac07
    Bridge br0
        Port br0
            Interface br0
                type: internal
        Port gre1
            Interface gre1
                type: system
    ovs_version: "3.1.0"
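
The `-s 1300` payload used in the pings leaves some headroom. As a back-of-the-envelope sketch (my own arithmetic, not a measurement from these experiments), the largest payload a gretap tunnel can carry follows from the header sizes, assuming IPv4 everywhere and a plain GRE header without key, checksum or sequence number. The function name is hypothetical, and the MTU actually configured on the bridge and tunnel interfaces can lower the limit further.

```shell
# Largest `ping -s` payload expected to pass unfragmented through a
# gretap tunnel, given the MTU ($1) of the underlying link.
# Assumption: IPv4 outer and inner headers, basic 4-byte GRE header.
gretap_max_ping_payload() {
    underlay_mtu=$1
    outer_ip=20      # outer IPv4 header
    gre=4            # basic GRE header (key/checksum/sequence add 4 bytes each)
    inner_eth=14     # encapsulated Ethernet header (gretap carries layer 2)
    inner_ip=20      # IPv4 header of the ping packet itself
    icmp=8           # ICMP echo header
    echo $(( $underlay_mtu - $outer_ip - $gre - $inner_eth - $inner_ip - $icmp ))
}

gretap_max_ping_payload 1500   # prints 1434
```

With a standard 1500-byte underlay this gives 1434 bytes, so a 1300-byte payload should fit, while anything larger than 1434 is expected to fail with `-M do`.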

GRE tunnel made in openvswitch

openvswitch uses flow-based GRE tunneling, i.e. a single gre_sys interface serves all tunnels. The tunnel itself can be created with ovs-vsctl. After that, everything works as expected.

# Create bridge, port and interface
nmcli con add type ovs-bridge con.int br0 ovs-bridge.stp-enable yes
nmcli con add type ovs-port con.int br0 con.master br0
nmcli con add type ovs-interface con.int br0 con.master br0 ipv4.method manual ipv4.address "$bridge_address" ipv4.routes 192.168.42.0/23

# Create GRE tunnel
ovs-vsctl add-port br0 gre1 -- set interface gre1 type=gre options:remote_ip="$remote_ip"

# Test the tunnel
ping -c 3 -M do -s 1300 192.168.42.1
ping -c 3 -M do -s 1300 192.168.43.1
#  ovs-vsctl show
de1f31e9-1b51-4cc3-954a-4e037191ac07
    Bridge br0
        Port br0
            Interface br0
                type: internal
        Port gre1
            Interface gre1
                type: gre
                options: {remote_ip="192.0.2.2"}
    ovs_version: "3.1.0"
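
The two `ovs-vsctl show` listings differ mainly in the type of the gre1 interface (system when the tunnel is created through NetworkManager, gre when openvswitch creates it). A small helper can make that visible at a glance. This is only a sketch: the function name `ovs_interface_types` is made up, and it relies on the output layout shown above, so interfaces that print no `type:` line are skipped.

```shell
# Print "<bridge> <interface> <type>" for every interface that has a
# "type:" line in the output of `ovs-vsctl show`, read from stdin.
ovs_interface_types() {
    awk '
        $1 == "Bridge"    { gsub(/"/, "", $2); bridge = $2 }
        $1 == "Interface" { gsub(/"/, "", $2); iface = $2 }
        $1 == "type:"     { print bridge, iface, $2 }
    '
}

# Usage on a live host: ovs-vsctl show | ovs_interface_types
```

For the listing directly above, this prints `br0 br0 internal` and `br0 gre1 gre`.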

openQA-like

Each worker has the same 10.0.2.2/15 address set on the bridge interface, plus some extra openvswitch "magic" (i.e. os-autoinst-openvswitch) that allows the SUT to contact the worker machine via a common address. Unfortunately, this renders the IP unusable for any inter-machine communication (pinging 10.0.2.2 from 10.0.2.2 just can't work).

Maybe the solution here is to just add an extra unique address for the bridge interface, which can be used for network checks. The added address does not even need to be from the same address space as long as we have correct routing tables.
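
A sketch of what such a check could look like, printed as a dry run rather than executed. Everything specific here is an assumption for illustration: the bridge name `br1`, the reuse of the experiment addresses from one shared /23 network (so no extra route is needed), and the helper name `print_check_commands`. `ping -I` accepts a source address, which keeps the probe off the shared 10.0.2.2 address.

```shell
# Print (not run) the commands for giving a bridge an extra, unique
# address and pinging a peer worker from that address.
#   $1 -- bridge name (assumed, e.g. br1)
#   $2 -- extra local address in CIDR form
#   $3 -- extra address of the peer worker
#   $4 -- ICMP payload size
print_check_commands() {
    bridge=$1 local_cidr=$2 peer=$3 size=$4
    echo "ip addr add $local_cidr dev $bridge"
    echo "ping -c 3 -M do -s $size -I ${local_cidr%/*} $peer"
}

print_check_commands br1 192.168.42.1/23 192.168.43.1 1300
```

On the peer worker the same call would be made with the two addresses swapped.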

Actions #10

Updated by jbaier_cz 3 months ago

  • Status changed from In Progress to Feedback

The investigation is probably complete, but further discussion is needed. I will try to bring this topic up at the next occasion.

Actions #11

Updated by jbaier_cz 3 months ago

  • Tags changed from collaborative-session to collaborative-session, infra
Actions #12

Updated by okurz 3 months ago

  • Due date set to 2024-02-12
Actions #13

Updated by jbaier_cz 3 months ago

@okurz just to clarify, what would be the best place to put the instructions? A new subsection under https://progress.opensuse.org/projects/openqav3/wiki#Infrastructure-setup-for-o3-openqaopensuseorg-and-osd-openqasusede or is there a better, more generic place?

Actions #14

Updated by okurz 3 months ago

https://progress.opensuse.org/projects/openqav3/wiki#Infrastructure-setup-for-o3-openqaopensuseorg-and-osd-openqasusede is fine. A better place would likely be the openQA documentation, but that takes a little more effort because the markdown has to be converted to asciidoc. Your choice.

Actions #16

Updated by jbaier_cz 3 months ago

  • Due date deleted (2024-02-12)
  • Status changed from Feedback to Resolved