action #120163
closedQA - coordination #121720: [saga][epic] Migration to QE setup in PRG2+NUE3 while ensuring availability
QA - coordination #116623: [epic] Migration of SUSE Nbg based openQA+QA+QAM systems to new security zones
Use salt grains instead of manually specifying IPs in "bridge_ip" size:M
0%
Description
Updated by okurz almost 2 years ago
- Copied from action #119443: Conduct the migration of SUSE openQA systems from Nbg SRV1 to new security zones size:M added
Updated by mkittler almost 2 years ago
- Assignee set to mkittler
I've already been creating a draft so I'll assign myself: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/770
Maybe it makes sense to test/apply this after the security zone migration to avoid doing too many things at once.
Updated by livdywan almost 2 years ago
- Subject changed from Use salt grains or something fancy instead of manually specifying IPs in "bridge_ip" to Use salt grains instead of manually specifying IPs in "bridge_ip" size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by okurz almost 2 years ago
mkittler wrote:
I've already been creating a draft so I'll assign myself: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/770
Maybe it makes sense to test/apply this after the security zone migration to avoid doing too many things at once.
@mkittler I appreciate that you want to prevent disruptions, but we need to update the addresses for the migration anyway wherever that has not been done already. If we used FQDNs for the hosts that still need migration, we would get that part "for free".
Updated by mkittler almost 2 years ago
- Status changed from Workable to In Progress
Ok, I'll try it out now on OSD then.
EDIT: The change doesn't break things in general, e.g. the following command still shows the IPs as expected:
sudo salt --no-color --state-output=changes -C 'G@roles:worker' cmd.run 'grep remote_ip /etc/wicked/scripts/gre_tunnel_preup.sh'
The next step would be removing bridge_ip from the pillars (in a few places as a start) to see whether the fallback to FQDNs then works.
EDIT: It doesn't work. I cannot add anything to the salt mine and don't know how to continue. It appears that my change to mine.sls is completely ignored, even though the change is now also in the pillars repo.
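The intended fallback described above can be sketched roughly as follows. This is only an illustration in plain Python, not the code from the merge request: the function name `tunnel_endpoint`, the shape of `pillar_workers` and `mine_fqdns`, and the example hostnames are all assumptions made for this sketch.

```python
from typing import Mapping, Optional


def tunnel_endpoint(
    minion_id: str,
    pillar_workers: Mapping[str, Mapping[str, str]],
    mine_fqdns: Mapping[str, str],
) -> Optional[str]:
    """Pick the address a GRE tunnel to `minion_id` should point at."""
    # An explicitly configured bridge_ip still wins, so pillars that have
    # not been cleaned up yet keep working unchanged.
    explicit = pillar_workers.get(minion_id, {}).get("bridge_ip")
    if explicit:
        return explicit
    # Otherwise fall back to the FQDN the minion published
    # (hypothetical mine data; None if the mine has nothing for this host).
    return mine_fqdns.get(minion_id) or None


# Hypothetical example data, not taken from the real pillars.
pillar_workers = {
    "worker-a.example.org": {"bridge_ip": "192.0.2.10"},
    "worker-b.example.org": {},
}
mine_fqdns = {
    "worker-a.example.org": "worker-a.example.org",
    "worker-b.example.org": "worker-b.example.org",
}
```

With such a fallback, removing bridge_ip for a host only changes the result if the mine actually contains an entry for that host, which is why an unpopulated mine blocks the approach.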
Updated by openqa_review almost 2 years ago
- Due date set to 2022-11-29
Setting due date based on mean cycle time of SUSE QE Tools
Updated by mkittler almost 2 years ago
I'm stuck adding new information to the salt mine. I'll try to research what I might be missing or ask Nick tomorrow.
Updated by okurz almost 2 years ago
In the meantime mkittler fixed the issue with help from nsinger, highly appreciated. https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/770 merged. Will you now prepare a corresponding change to our salt pillars that removes the bridge_ip settings?
Updated by okurz almost 2 years ago
https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/460 merged 32 minutes ago. Please do a sanity check, e.g. with salt cmd.run
Updated by mkittler almost 2 years ago
- Status changed from In Progress to Resolved
The deployment pipeline has passed, the salt mine changes are effective and the config still looks sane.
Updated by mkittler almost 2 years ago
- Status changed from Resolved to Feedback
Looks like now there are some template rendering issues in the relevant code:
QA-Power8-5-kvm.qa.suse.de:
Data failed to compile:
----------
Rendering SLS 'base:openqa.openvswitch' failed: Jinja variable list object has no element 0
malbec.arch.suse.de:
Data failed to compile:
----------
Rendering SLS 'base:openqa.openvswitch' failed: Jinja variable list object has no element 0
worker11.oqa.suse.de:
Data failed to compile:
----------
Rendering SLS 'base:openqa.openvswitch' failed: Jinja variable list object has no element 0
worker3.oqa.suse.de:
Data failed to compile:
----------
Rendering SLS 'base:openqa.openvswitch' failed: Jinja variable list object has no element 0
worker6.oqa.suse.de:
Data failed to compile:
----------
Rendering SLS 'base:openqa.openvswitch' failed: Jinja variable list object has no element 0
worker5.oqa.suse.de:
Data failed to compile:
----------
Rendering SLS 'base:openqa.openvswitch' failed: Jinja variable list object has no element 0
worker8.oqa.suse.de:
Data failed to compile:
----------
Rendering SLS 'base:openqa.openvswitch' failed: Jinja variable list object has no element 0
worker9.oqa.suse.de:
Data failed to compile:
----------
Rendering SLS 'base:openqa.openvswitch' failed: Jinja variable list object has no element 0
worker12.oqa.suse.de:
Data failed to compile:
----------
Rendering SLS 'base:openqa.openvswitch' failed: Jinja variable list object has no element 0
worker10.oqa.suse.de:
Data failed to compile:
----------
Rendering SLS 'base:openqa.openvswitch' failed: Jinja variable list object has no element 0
(from https://gitlab.suse.de/openqa/salt-states-openqa/-/jobs/1252150)
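"list object has no element 0" is typically what Jinja reports when a template indexes [0] into an empty list. Assuming the list in question comes from mine data (the ticket does not show the template itself), a check like the following could list the hosts whose mine entry is missing or empty before states are applied. Plain Python sketch; both function names and the data layout are hypothetical.

```python
from typing import Iterable, List, Mapping, Optional, Sequence


def hosts_missing_mine_data(
    expected_hosts: Iterable[str],
    mine_return: Mapping[str, Sequence[str]],
) -> List[str]:
    """Return the hosts for which the mine has no entry or an empty one."""
    return sorted(h for h in expected_hosts if not mine_return.get(h))


def first_or_default(values: Optional[Sequence[str]], default: Optional[str] = None) -> Optional[str]:
    """Like values[0], but tolerates an empty or missing list."""
    return values[0] if values else default
```

In a template the second helper would roughly correspond to combining Jinja's `first` and `default` filters instead of indexing with [0]; whether that fits the actual SLS is an open question.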
Updated by mkittler almost 2 years ago
On OSD I now get only two occurrences of "Data failed to compile" and no "Rendering SLS" error:
martchus@openqa:~> sudo salt -l error --state-output=changes \* state.apply
worker5.oqa.suse.de:
Data failed to compile:
----------
The function "state.highstate" is running as PID 32342 and was started at 2022, Nov 22 15:40:32.201112 with jid 20221122154032201112
grenache-1.qa.suse.de:
Data failed to compile:
----------
The function "state.highstate" is running as PID 585787 and was started at 2022, Nov 22 15:39:52.397112 with jid 20221122153952397112
…
Summary for openqaworker-arm-2.suse.de
--------------
Succeeded: 397 (changed=4)
Failed: 0
--------------
Total states run: 397
Total run time: 82.397 s
…
ERROR: Minions returned with non-zero exit code
When I tried it again I got:
martchus@openqa:~> sudo salt -l error --state-output=changes \* state.apply
openqaworker-arm-2.suse.de:
Data failed to compile:
----------
The function "state.highstate" is running as PID 48302 and was started at 2022, Nov 22 15:44:10.570942 with jid 20221122154410570942
…
Summary for worker5.oqa.suse.de
--------------
Succeeded: 498 (changed=4)
Failed: 0
--------------
Total states run: 498
Total run time: 50.702 s
…
Summary for grenache-1.qa.suse.de
--------------
Succeeded: 577 (changed=4)
Failed: 0
--------------
Total states run: 577
Total run time: 50.056 s
…
ERROR: Minions returned with non-zero exit code
So the result is the same except that this time a completely different worker runs into the error (and the ones that previously ran into it no longer do). I doubt that this issue is the same as the "Rendering SLS" one which I wanted to reproduce (but apparently cannot). I've nevertheless looked into the issue but only found old bug reports that are likely not relevant (e.g. https://github.com/saltstack/salt/issues/16432 and https://github.com/saltstack/salt/issues/34362).
Updated by mkittler almost 2 years ago
- Status changed from Feedback to Resolved
We've discussed that in the unblock meeting.
About the first issue (#120163#note-11): It only happened once and could not be reproduced. It is likely a general issue with the mine, which apparently was not fully populated at some point. So we can likely close this ticket for now (it only introduced yet another use of the mine) but keep it in mind should we see the problem again.
About the second issue (#120163#note-12): It is really unrelated and shouldn't block this ticket from being resolved. It is likely happening because the minion is still busy at the time one attempts to apply states again. I don't really understand it because on the previous run e.g. arm-2 succeeds (no timeout or anything) and on the next run it runs into the problem.
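If the second issue is indeed caused by a minion that is still busy with a previous run, a wrapper could recognize that specific message and simply try again later. The sketch below assumes exactly that; `apply_with_retry`, `is_busy` and the `run` callable (which would wrap the actual salt invocation and return its output) are hypothetical and not part of our tooling.

```python
import re
import time
from typing import Callable

# Matches the message shown in the output above.
BUSY_PATTERN = re.compile(r'The function "state\.highstate" is running as PID \d+')


def is_busy(output: str) -> bool:
    """True if the output says a highstate is already running on a minion."""
    return BUSY_PATTERN.search(output) is not None


def apply_with_retry(run: Callable[[], str], attempts: int = 3, delay: float = 30.0) -> str:
    """Call run() up to `attempts` times while a minion reports being busy."""
    output = run()
    for _ in range(attempts - 1):
        if not is_busy(output):
            break
        time.sleep(delay)  # give the running highstate time to finish
        output = run()
    return output
```

Other failures, such as the "Rendering SLS" error, do not match the pattern and are returned immediately, so they would still surface.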
Updated by mkittler almost 2 years ago
- Related to action #120921: [alert] Salt states fail to compile with "Rendering SLS 'base:openqa.openvswitch' failed: Jinja error: argument of type 'NoneType' is not iterable" size:M added
Updated by mkittler almost 2 years ago
There's yet another problem with rendering that template. However, it now fails even earlier, so I've created a separate ticket (and added it as a related ticket).
Updated by mkittler almost 2 years ago
The issue #120921 was really not related at all.
Updated by mkittler almost 2 years ago
- Related to deleted (action #120921: [alert] Salt states fail to compile with "Rendering SLS 'base:openqa.openvswitch' failed: Jinja error: argument of type 'NoneType' is not iterable" size:M)