action #153769
coordination #112862 (closed): [saga][epic] Future ideas for easy multi-machine handling: MM-tests as first-class citizens
coordination #111929: [epic] Stable multi-machine tests covering multiple physical workers
Better handle changes in GRE tunnel configuration size:M
When changing the GRE tunnel configuration (/etc/wicked/scripts/) by changing related salt states or workerconf.sls in pillars, these changes are not applied automatically, unlike worker settings. This can lead to openQA test failures due to inconsistencies, as well as potentially incomplete routing due to STP selections.
Acceptance criteria
- AC1: We are able to change the GRE tunnel configuration on any salt-controlled openQA worker without causing openQA test failures
- Run ovs-appctl stp/show on all workers to see how it currently routes packets
- In the best case our salt states handle this automatically. It would be possible to simply re-run the script after it has changed.
  - Adding/removing ports will cause a temporary unavailability of the network and thus disrupt tests.
  - Stop the services, re-run the script and finally start the services again?
  - If necessary reboot the host (not sure how easy this is to trigger from salt states).
- In the worst case we make sure the limitation is properly documented with instructions to follow (e.g. command to reboot all workers).
- So simply try re-running /etc/wicked/scripts/ in salt after it has changed and monitor for bad consequences
- Monitor
- If nothing bad happened then assume we are done, else try to trigger reboots
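The "stop the services, re-run the script, start the services again" idea could look roughly like the Salt sketch below. It is untested and only illustrates the ordering. The script name EXAMPLE.sh, its salt:// source and the unit name in worker_service are hypothetical placeholders, since neither the real script name nor the affected services are spelled out in this ticket.

```yaml
# Hypothetical placeholders, not taken from the real salt states
{% set gre_script = '/etc/wicked/scripts/EXAMPLE.sh' %}
{% set worker_service = 'openqa-worker.target' %}

# File state that the onchanges requisites below watch
{{ gre_script }}:
  file.managed:
    - source: salt://example/EXAMPLE.sh
    - mode: "0755"

# Stop the services only if the script content actually changed
stop_services_before_gre_change:
  service.dead:
    - name: {{ worker_service }}
    - onchanges:
      - file: {{ gre_script }}

# Re-run the script once the services are down
rerun_gre_script:
  cmd.run:
    - name: {{ gre_script }}
    - onchanges:
      - file: {{ gre_script }}
    - require:
      - service: stop_services_before_gre_change

# Bring the services back after the script has been re-run
start_services_after_gre_change:
  service.running:
    - name: {{ worker_service }}
    - require:
      - cmd: rerun_gre_script
```

With onchanges the stop and re-run steps are skipped on highstate runs where the script is unchanged, so hosts without a configuration change should not be disrupted. Jobs running at the time of an actual change would still be interrupted.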
Further details
- Check out #152389#note-63 and subsequent comments for further context.
Updated by mkittler about 1 year ago
- Related to action #152389: significant increase in MM-test failure ratio 2023-12-11: test fails in multipath_iscsi and other multi-machine scenarios due to MTU size auto_review:"ping with packet size 1350 failed, problems with MTU" size:M added
Updated by okurz about 1 year ago
- Tags set to infra, salt, gre, multi-machine
- Target version set to future
- Parent task set to #111929
Updated by okurz about 1 year ago
- Subject changed from Better handle changes in GRE tunnel configuration to Better handle changes in GRE tunnel configuration size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by okurz about 1 year ago
- Related to action #154552: [ppc64le] test fails in iscsi_client - zypper reports Error Message: Could not resolve host: added
Updated by okurz about 1 year ago
- Status changed from Workable to In Progress
- Assignee set to okurz
Updated by okurz about 1 year ago
- Status changed from In Progress to Feedback
Updated by okurz about 1 year ago
Now proposing in salt:
# Alternative to call 'systemctl restart network'
# If network reinitialization is not enough we could still go with
wicked ifup all:
  - onchanges:
    - file: /etc/wicked/scripts/
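For reference, a complete state along these lines might look like the sketch below. The cmd.run module line is an assumption about how the proposal is wired up, as it does not appear in the snippet above. EXAMPLE.sh and its salt:// source are hypothetical placeholders for the script, whose name is not given here.

```yaml
# Hypothetical placeholder for the script managed under /etc/wicked/scripts/
{% set gre_script = '/etc/wicked/scripts/EXAMPLE.sh' %}

# File state that onchanges refers to (hypothetical source)
{{ gre_script }}:
  file.managed:
    - source: salt://example/EXAMPLE.sh
    - mode: "0755"

# The state ID doubles as the command; it runs only when the script changed
wicked ifup all:
  cmd.run:
    - onchanges:
      - file: {{ gre_script }}
```

The onchanges requisite only fires if a file state matching that ID or name reports changes in the same run, so the command is not executed on every highstate.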
Updated by okurz about 1 year ago
- Target version changed from Ready to Tools - Next
Updated by okurz 12 months ago
- Status changed from Feedback to Resolved
- Target version changed from Tools - Next to Ready
No more problems observed in the past days. Maybe the change helped, but #155929 is also a likely candidate.