action #109734
closed
Better way to prevent conflicts between openqa-worker@ and openqa-worker-auto-restart@ variants size:M
Description
Motivation
Multiple times we have found users trying to start the systemd services openqa-worker@ when we are in fact running openqa-worker-auto-restart@. The last time this happened was in #109055. We should find a better way which is less confusing to users. Ideally we would have only openqa-worker@ and use configuration to solve the auto-restart requirement.
Acceptance criteria
- AC1: We have an unambiguous solution for providing both variants of worker modes which does not confuse users
- AC2: Ensure documentation covers the updated way
Suggestions
- DONE: Research about systemd service best practices
- DONE: Research why we chose to have a separate systemd service at the time -> Likely because we need different settings on the systemd level
- DONE: Conduct a brainstorming session together, different ideas:
  - Replace openqa-worker@ by a symlink pointing to the real solution, e.g. "openqa-worker-auto-restart" and "openqa-worker-plain", similar to "network.service" which for us is a symlink in /etc/systemd/system pointing to e.g. /usr/lib/systemd/system/NetworkManager.service or /usr/lib/systemd/system/wicked.service
  - Alternative: just provide a drop-in file instead of a separate systemd service file
  - Alternative: Potentially provide the drop-in file in an openSUSE package
  - Alternative: Solve it within the process itself so that systemd is not involved, e.g. same as hypnotoad or nginx
  - Should the services be actual "conflicts" on the level of systemd? -> Likely yes, but not a full solution. Could be done on top
- Update documentation
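The symlink idea from the brainstorming can be tried out without touching a real system. The following is a minimal sketch, assuming a scratch directory that stands in for /usr/lib/systemd/system and /etc/systemd/system. The name openqa-worker-plain@.service is the example name from the list above, and the unit file contents are placeholders, not real units.

```shell
# Sketch only: a scratch directory stands in for the real systemd paths.
set -eu
root=$(mktemp -d)
mkdir -p "$root/usr/lib/systemd/system" "$root/etc/systemd/system"

# Placeholder unit files for the two variants (contents are not real units).
for variant in plain auto-restart; do
    printf '# placeholder for %s variant\n' "$variant" \
        > "$root/usr/lib/systemd/system/openqa-worker-$variant@.service"
done

# The generic name is only a symlink selecting one of the variants.
ln -sf /usr/lib/systemd/system/openqa-worker-auto-restart@.service \
    "$root/etc/systemd/system/openqa-worker@.service"

# Resolve which variant the generic name currently points to.
selected=$(readlink "$root/etc/systemd/system/openqa-worker@.service")
echo "openqa-worker@ -> ${selected##*/}"
```

With this layout users only ever need to know the generic name, and the choice of variant is a single symlink that can be inspected with readlink.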
Updated by okurz over 2 years ago
- Related to action #109055: Broken workers alert added
Updated by okurz over 2 years ago
- Subject changed from Better way to prevent conflicts between openqa-worker@ and openqa-worker-auto-restart@ variants to Better way to prevent conflicts between openqa-worker@ and openqa-worker-auto-restart@ variants size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by okurz over 2 years ago
Wicked solves it with the following code in the spec file:
%if %{with systemd}
%pre service
# upgrade from sysconfig[-network] scripts
_id=`readlink /etc/systemd/system/network.service 2>/dev/null` || :
if test "x${_id##*/}" = "xnetwork.service" -a -x /etc/init.d/network ; then
    /etc/init.d/network stop-all-dhcp-clients || :
fi
%{service_add_pre wicked.service}
%post service
%{service_add_post wicked.service}
# See bnc#843526: presets do not apply for upgrade / are not sufficient
# to handle sysconfig-network|wicked -> wicked migration
_id=`readlink /etc/systemd/system/network.service 2>/dev/null` || :
case "${_id##*/}" in
    ""|wicked.service|network.service)
        /usr/bin/systemctl --system daemon-reload || :
        /usr/bin/systemctl --force enable wicked.service || :
    ;;
esac
%preun service
# stop the daemons on removal
# - stopping wickedd should be sufficient ... other just to be sure.
# - stopping of the wicked.service does not stop network, but removes
# the wicked.service --> network.service link and resets its status.
%{service_del_preun wickedd.service wickedd-auto4.service wickedd-dhcp4.service wickedd-dhcp6.service wickedd-nanny.service wicked.service}
%postun service
# restart wickedd after upgrade
%{service_del_postun wickedd.service}
…
%endif
However, NetworkManager and also systemd-network did not provide that when I tried out the install in a container environment. I guess one should still research a systemd ecosystem best practice here.
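The core of the wicked scriptlet is the decision whether the package may claim the generic alias, based on the basename of the current symlink target. A minimal sketch of that decision as a standalone function follows; may_claim_alias and its parameters are hypothetical names introduced for illustration and are not part of the wicked spec file.

```shell
# Sketch: mirror the case statement from the %post scriptlet above.
# may_claim_alias is a hypothetical helper, not part of any package.
#   $1: current symlink target as printed by readlink (may be empty)
#   $2: the unit shipped by the package, e.g. wicked.service
#   $3: the generic alias name, e.g. network.service
may_claim_alias() {
    target=$1
    own_unit=$2
    alias_name=$3
    # ${target##*/} strips everything up to the last slash (the basename).
    case "${target##*/}" in
        ""|"$own_unit"|"$alias_name") return 0 ;;  # unset or already ours
        *) return 1 ;;                             # someone else owns it
    esac
}

if may_claim_alias "/usr/lib/systemd/system/NetworkManager.service" \
        wicked.service network.service; then
    echo "would enable wicked.service"
else
    echo "alias belongs to another unit, leaving it alone"
fi
```

The same pattern could be applied to openqa-worker@.service, with the caveat that this only sketches the check and not the enabling step.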
Updated by okurz over 2 years ago
- Tags set to reactive work
- Parent task deleted (#80908)
Updated by jbaier_cz over 2 years ago
I think we can do the transition in two phases:
- Move from openqa-worker@.service to openqa-worker-plain@.service and provide symlink for the old service name pointing to the new one
- Update spec file to find out if openqa-worker@.service is a symlink and do not change it during updates
Did I miss something?
The PR for the first step is: https://github.com/os-autoinst/openQA/pull/4687
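The second phase could look roughly like the sketch below: create a default alias only when no symlink exists yet, so that an existing choice survives updates. This is an illustration under assumptions and not taken from the PR; install_default_alias is a hypothetical helper, and a regular file at the alias path (the pre-rename unit) is deliberately not handled here.

```shell
# Hypothetical helper: create the default alias only if none exists yet.
#   $1: directory holding the unit files
install_default_alias() {
    unit_dir=$1
    alias_path="$unit_dir/openqa-worker@.service"
    if [ -L "$alias_path" ]; then
        # An existing symlink is treated as a deliberate choice; keep it.
        echo "keeping existing alias -> $(readlink "$alias_path")"
        return 0
    fi
    # Relative target: the alias lives next to the unit it points to.
    ln -s openqa-worker-plain@.service "$alias_path"
    echo "created default alias -> openqa-worker-plain@.service"
}
```

Called from a scriptlet, such a helper would be idempotent: running it on every update never changes a symlink that is already in place.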
Updated by okurz over 2 years ago
- Due date set to 2022-06-16
- Status changed from Workable to In Progress
- Assignee set to jbaier_cz
Updated by jbaier_cz over 2 years ago
As the PR is merged, the next step should be some spec file magic to not overwrite the openqa-worker@.service symlink.
Updated by jbaier_cz over 2 years ago
- Status changed from In Progress to Feedback
I just realized there is an even better mechanism to preserve user changes: just put them in /etc, where they belong.
MR for salt to do exactly that: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/698
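This works because systemd prefers a unit found in /etc/systemd/system over one with the same name in /usr/lib/systemd/system, and package updates do not touch the former. The lookup order can be sketched as follows, as a simplification that ignores /run/systemd/system and all other search paths; resolve_unit is a hypothetical helper and the root argument is a scratch directory, not the real filesystem.

```shell
# Sketch of the precedence between the two directories discussed here.
# resolve_unit is a hypothetical helper.
#   $1: scratch root directory, $2: unit file name
resolve_unit() {
    for dir in etc/systemd/system usr/lib/systemd/system; do
        # -L also accepts symlinks whose target does not exist in the scratch root.
        if [ -e "$1/$dir/$2" ] || [ -L "$1/$dir/$2" ]; then
            echo "/$dir/$2"
            return 0
        fi
    done
    return 1
}
```

A symlink placed in /etc by configuration management therefore wins over whatever the package ships under the same name in /usr/lib.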
Updated by jbaier_cz over 2 years ago
- Status changed from Feedback to Resolved
Changes are merged and deployed. The documentation was also updated in https://github.com/os-autoinst/openQA/pull/4695
systemd is listing the correct unit:
openqaworker5:~> systemctl cat openqa-worker@
# /usr/lib/systemd/system/openqa-worker-auto-restart@.service
...
So hopefully, this will help.
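The check above can be automated by reading the header comment that systemctl cat prints before the unit contents. A minimal sketch, assuming the output format shown above (a first line of the form "# /path/to/unit"); backing_unit_file is a hypothetical helper that reads that output from stdin.

```shell
# Hypothetical helper: print the first "# /path" header found on stdin.
backing_unit_file() {
    sed -n 's|^# \(/.*\)$|\1|p' | head -n 1
}

# Intended use on a worker host (not executed here):
#   systemctl cat openqa-worker@ | backing_unit_file
```

Comparing the result against the expected variant would make it possible to alert when a host resolves openqa-worker@ to the wrong unit.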
Updated by okurz over 2 years ago
- Status changed from Resolved to Feedback
Please review https://gitlab.suse.de/openqa/salt-states-openqa#remarks-about-the-systemd-units-used-to-start-workers as well. Maybe we can simplify the section and make it less error-prone now.
Please also consider using the openqa-worker@ symlinked service definition directly.
Updated by okurz over 2 years ago
- Due date deleted (2022-06-16)
- Status changed from Feedback to Resolved
https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/699 merged. Seems good.
Updated by jbaier_cz over 1 year ago
- Related to action #133352: Activating systemd target openqa-worker.target when openqa-worker-auto-restart@ is already used causes havoc size:M added