action #106666
Improve worker startup in our salt states or "openqa-worker-auto-restart repeatedly failing on grenache-1.qa.suse.de"
Status:
Resolved
Priority:
High
Assignee:
Category:
-
Target version:
Start date:
2022-02-11
Due date:
% Done:
0%
Estimated time:
Description
Motivation
It can happen that we disable single worker instances on openQA workers (e.g. https://progress.opensuse.org/issues/106257#note-9). If we use the mask approach, our deployment pipeline fails because our states try to start every worker instance counted by the "numofworkers" field (https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls#L44). This happens here: https://gitlab.suse.de/openqa/salt-states-openqa/-/blob/master/openqa/worker.sls#L190-194
So even commenting out the affected instances wouldn't work, because the states start instances by count rather than by their explicit entries.
Suggestions
The following flow would allow us to just comment out instances in addition to masking them manually:
- Iterate over every key for each worker (https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls#L52) and use its instance number to start that instance explicitly
- Take the last explicitly defined instance number, subtract it from "numofworkers", and start only the remaining instances
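The flow above could be sketched roughly as follows. This is only an illustration in Python, not the actual Salt/Jinja state. The function names `instances_to_start` and `unit_names` are hypothetical, and the `openqa-worker-auto-restart@<N>` unit naming is an assumption based on the ticket title. It also assumes instance numbers start at 1 and that "the remaining instances" means the numbers after the last explicitly defined one, up to "numofworkers".

```python
def instances_to_start(numofworkers, explicit_instances):
    """Return the worker instance numbers that should be started.

    explicit_instances: instance numbers that still have a key in the
    worker's config (commented-out instances are simply absent).
    """
    # Step 1: every explicitly defined instance is started by its own number.
    explicit = sorted(set(explicit_instances))

    # Step 2: the instances after the last explicit one, up to numofworkers,
    # have no explicit key and are started as "the remaining instances".
    last_explicit = explicit[-1] if explicit else 0
    remaining = list(range(last_explicit + 1, numofworkers + 1))

    return explicit + remaining


def unit_names(instances):
    # Assumed unit naming; adjust to whatever the states actually manage.
    return [f"openqa-worker-auto-restart@{n}" for n in instances]


if __name__ == "__main__":
    # Instance 3 is commented out in the config, so it is skipped.
    print(unit_names(instances_to_start(6, [1, 2, 4])))
```

With this approach a commented-out instance is skipped only if a later instance is still defined explicitly. An instance above the last explicit number would still be started through the "remaining" range.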