action #153769
coordination #112862 (closed): [saga][epic] Future ideas for easy multi-machine handling: MM-tests as first-class citizens
coordination #111929: [epic] Stable multi-machine tests covering multiple physical workers
Better handle changes in GRE tunnel configuration size:M
Start date: 2024-01-17
Due date:
% Done: 0%
Estimated time:
Tags:
Description
Motivation
When changing the GRE tunnel configuration (/etc/wicked/scripts/gre_tunnel_preup.sh) by changing related salt states or workerconf.sls in pillars, these changes are not applied automatically, unlike worker settings. This can lead to openQA test failures due to inconsistencies as well as potentially incomplete routing due to STP selections.
Acceptance criteria
- AC1: We are able to change the GRE tunnel configuration on any salt-controlled openQA worker without causing openQA test failures
Suggestions
- Run ovs-appctl stp/show on all workers to see how packets are currently routed
- In the best case our salt states handle this automatically. It would be possible to simply re-run /etc/wicked/scripts/gre_tunnel_preup.sh after it has changed.
  - Adding/removing ports will cause a temporary unavailability of the network and thus disrupt tests.
  - Stop the services, re-run the script and finally start the services again?
  - If necessary reboot the host (not sure how easy this is to trigger from salt states).
- In the worst case we make sure the limitation is properly documented with instructions to follow (e.g. command to reboot all workers).
- So simply try re-running /etc/wicked/scripts/gre_tunnel_preup.sh in salt after it has changed and monitor for bad consequences
  - Monitor https://monitor.qa.suse.de/d/nRDab3Jiz/openqa-jobs-test?orgId=1&viewPanel=24
  - If nothing bad happened then assume we are done, else try to trigger reboots
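The "re-run the script after it has changed" idea can be sketched outside of salt as a small checksum guard. This is only an illustration, not what the salt states actually do: the function name rerun_if_changed and the STATE_FILE location are hypothetical, and the arguments that wicked normally passes to the pre-up script are not known here, so any arguments are simply passed through.

```shell
#!/bin/sh
# Sketch: re-run a script only if its content changed since the last successful run.
# SCRIPT defaults to the path named in this ticket; STATE_FILE is a hypothetical
# location used to remember the checksum of the last version that was run.
SCRIPT="${SCRIPT:-/etc/wicked/scripts/gre_tunnel_preup.sh}"
STATE_FILE="${STATE_FILE:-/var/tmp/gre_tunnel_preup.sha256}"

rerun_if_changed() {
    # Checksum of the script as it is on disk right now
    new_sum=$(sha256sum "$SCRIPT" 2>/dev/null | cut -d' ' -f1)
    [ -n "$new_sum" ] || return 1

    # Checksum recorded by the previous run (empty on first use)
    old_sum=$(cat "$STATE_FILE" 2>/dev/null || true)

    if [ "$new_sum" = "$old_sum" ]; then
        echo "unchanged"
        return 0
    fi

    # Content differs: run the script, passing through any arguments.
    # Its own output goes to stderr so stdout only carries the status word.
    sh "$SCRIPT" "$@" >&2 || return 1

    # Record the checksum only after a successful run
    printf '%s\n' "$new_sum" > "$STATE_FILE"
    echo "reran"
}
```

Calling rerun_if_changed prints "reran" when the script was executed and "unchanged" otherwise. Within salt the same effect would more naturally be expressed with an onchanges requisite on the state managing the file; either way, the network disruption mentioned above still applies whenever the script actually runs.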
Further details
- Check out #152389#note-63 and subsequent comments for further context.
Updated by mkittler 11 months ago
- Related to action #152389: significant increase in MM-test failure ratio 2023-12-11: test fails in multipath_iscsi and other multi-machine scenarios due to MTU size auto_review:"ping with packet size 1350 failed, problems with MTU" size:M added
Updated by okurz 11 months ago
- Related to action #154552: [ppc64le] test fails in iscsi_client - zypper reports Error Message: Could not resolve host: openqa.suse.de added
Updated by okurz 10 months ago
- Status changed from Feedback to Resolved
- Target version changed from Tools - Next to Ready
No more problems observed in the past days. Maybe https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/1105 helped, but #155929 is also a likely candidate.