action #152557
(closed) openQA Project (public) - coordination #112862: [saga][epic] Future ideas for easy multi-machine handling: MM-tests as first-class citizens
openQA Project (public) - coordination #111929: [epic] Stable multi-machine tests covering multiple physical workers
unexpected routing between PRG1/NUE2+PRG2
Description
Observation
As brought up in https://mailman.suse.de/mlarch/SuSE/openvpn-info/2023/openvpn-info.2023.12/msg00022.html
traffic that should stay within the SUSE network seems to be routed unexpectedly over outside routers. For example, as observed on imagetester.qe.nue2.suse.org from tracepath worker40.oqa.prg2.suse.org:
1?: [LOCALHOST] pmtu 1500
1: 10.168.195.254 0.126ms
1: 10.168.195.254 0.104ms
2: 10.168.195.254 0.136ms pmtu 1422
2: 195.135.223.4 4.590ms
3: worker40.oqa.prg2.suse.org 4.775ms reached
I don't know the host 195.135.223.4, but it does not look like a SUSE-internal host that should be in the path.
The detour also has a considerable impact on the effective MTU size: the path MTU drops from 1500 to 1422.
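To make the observation easier to repeat across hosts, the tracepath output can be summarized with a small script. The following is only a sketch: the helper names `summarize` and `max_icmp_payload` are hypothetical, the sample text is copied from the output above, and "not a private address" is just a heuristic, since a public address may still belong to SUSE. The payload calculation assumes IPv4 without options (20 byte IP header, 8 byte ICMP header) and ignores any tunnel overhead in multi-machine setups.

```python
import ipaddress
import re

# Sample copied from the tracepath output in this ticket.
SAMPLE = """\
1?: [LOCALHOST] pmtu 1500
1: 10.168.195.254 0.126ms
1: 10.168.195.254 0.104ms
2: 10.168.195.254 0.136ms pmtu 1422
2: 195.135.223.4 4.590ms
3: worker40.oqa.prg2.suse.org 4.775ms reached
"""

HOP_RE = re.compile(r"^\s*(\d+)\??:\s+(\S+)")
PMTU_RE = re.compile(r"pmtu (\d+)")


def summarize(output):
    """Return (smallest reported pmtu or None, list of non-private hop IPs)."""
    pmtus = []
    public_hops = []
    for line in output.splitlines():
        pmtu = PMTU_RE.search(line)
        if pmtu:
            pmtus.append(int(pmtu.group(1)))
        hop = HOP_RE.match(line)
        if not hop:
            continue
        try:
            addr = ipaddress.ip_address(hop.group(2))
        except ValueError:
            continue  # hostname or [LOCALHOST], nothing to classify
        # Heuristic only: non-private does not prove "outside SUSE".
        if not addr.is_private and str(addr) not in public_hops:
            public_hops.append(str(addr))
    return (min(pmtus) if pmtus else None), public_hops


def max_icmp_payload(pmtu):
    """Largest ICMP echo payload that fits unfragmented, assuming
    IPv4 without options: 20 bytes IP header + 8 bytes ICMP header."""
    return pmtu - 20 - 8


if __name__ == "__main__":
    path_mtu, hops = summarize(SAMPLE)
    print(path_mtu, hops, max_icmp_payload(path_mtu))
```

For the sample above this reports a path MTU of 1422 and 195.135.223.4 as the only non-private hop.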
Reproducible
Reproducible from all non-PRG2 salt machines to PRG2:
ssh osd "sudo salt \* cmd.run 'tracepath worker40.oqa.prg2.suse.org'"
Acceptance criteria
- AC1: The expected route has been clarified
- AC2: The impact of routing on MTU in openQA multi-machine tests is understood
Suggestions
- Find out together with Eng-Infra what route is expected, e.g. SD ticket
Updated by okurz about 1 year ago
- Copied from action #152389: significant increase in MM-test failure ratio 2023-12-11: test fails in multipath_iscsi and other multi-machine scenarios due to MTU size auto_review:"ping with packet size 1350 failed, problems with MTU" size:M added
Updated by okurz about 1 year ago
- Status changed from In Progress to Blocked
- Priority changed from High to Normal
Updated by okurz about 1 year ago
- Status changed from Blocked to Resolved
https://sd.suse.com/servicedesk/customer/portal/1/SD-142223 is still open but not moving forward. However, the main question was answered: the routing is as expected for traffic between datacenters. We have already accommodated that in our worker configuration, and I think most of us have understood the impact.