action #71332
[alert] failed systemd service on openqaworker6: "display-manager"
Description
Observation
The alert at https://stats.openqa-monitor.qa.suse.de/d/KToPYLEWz/failed-systemd-services fired and points to openqaworker6. Running systemctl --failed on that machine shows that display-manager.service is at fault. Details from systemctl status display-manager:
$ sudo systemctl status display-manager.service
● display-manager.service - X Display Manager
   Loaded: loaded (/usr/lib/systemd/system/display-manager.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sun 2020-09-13 03:37:43 CEST; 1 day 11h ago
    Tasks: 21
   CGroup: /system.slice/display-manager.service
           ├─11229 /usr/bin/xdm
           ├─11304 /usr/bin/X -nolisten tcp -br vt7 -keeptty -auth /var/lib/xdm/authdir/authfiles/A:0-zcYw1g
           ├─14468 -:0
           └─14647 /usr/bin/xconsole -notify -nostdin -verbose -exitOnFail

Sep 13 03:37:42 openqaworker6 display-manager[9118]: Command: localectl set-keymap de-nodeadkeys
Sep 13 03:37:42 openqaworker6 display-manager[9118]: I: Using systemd /usr/share/systemd/kbd-model-map mapping
Sep 13 03:37:43 openqaworker6 root[9951]: /etc/init.d/xdm: No changes for /etc/X11/xdm/Xservers
Sep 13 03:37:43 openqaworker6 display-manager[9118]: Starting service xdm..failed
Sep 13 03:37:43 openqaworker6 systemd[1]: display-manager.service: Control process exited, code=exited status=1
Sep 13 03:37:43 openqaworker6 systemd[1]: Failed to start X Display Manager.
Sep 13 03:37:44 openqaworker6 systemd[1]: display-manager.service: Unit entered failed state.
Sep 13 03:37:44 openqaworker6 systemd[1]: display-manager.service: Triggering OnFailure= dependencies.
Sep 13 03:37:44 openqaworker6 systemd[1]: display-manager.service: Failed to enqueue OnFailure= job: No such file or directory
Sep 13 03:37:44 openqaworker6 systemd[1]: display-manager.service: Failed with result 'exit-code'.
I wonder why we even try to start a display manager on that machine; it should be headless.
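For reference, a minimal sketch of how the failed unit names could be extracted on a host for scripting. The helper name failed_unit_names is hypothetical and not part of any existing tooling; it only assumes the column layout that systemctl prints with --no-legend --plain, where the unit name is the first field.

```shell
#!/bin/sh
# Hypothetical helper (not part of the monitoring setup): reads the output of
# `systemctl --failed --no-legend --plain` on stdin and prints only the unit
# name, which is assumed to be the first whitespace-separated field.
failed_unit_names() {
    awk 'NF > 0 { print $1 }'
}

# Assumed usage on a real host:
#   systemctl --failed --no-legend --plain | failed_unit_names
```

Reading from stdin keeps the helper independent of the host, so it can also be fed output collected from several workers.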
History
#1
Updated by okurz 4 months ago
sudo salt -l error --state-output=changes -C 'G@roles:worker' cmd.run 'systemctl status display-manager'
reveals that openqaworker5, openqaworker6 and grenache-1 have a display manager, the others do not.
EDIT: Assuming that we do not actually need a display manager at all, I ran the following on openqaworker6:
sudo systemctl disable --now display-manager && sudo systemctl reset-failed
sudo zypper rm -u xdm
A subsequent sudo salt -l error --state-output=changes -C 'openqaworker6*' state.apply
on osd does not bring back any packages, so this should be enough, as the other workers also do not have this package. I did the same on all machines with
sudo salt -l error --state-output=changes -C 'G@roles:worker' cmd.run 'zypper -n rm -u xdm'
which also removed the package on openqaworker5 and grenache-1.
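The cleanup steps from this comment could be wrapped roughly as follows. This is only a sketch: the run wrapper, the DRY_RUN variable and the function name remove_display_manager are hypothetical and introduced here for illustration. With DRY_RUN=1, which is the default in this sketch, the commands are printed instead of executed.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command when DRY_RUN=1 (the default in this
# sketch), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# The same three steps as in the comment above, with zypper in
# non-interactive mode (-n) as used in the salt call.
remove_display_manager() {
    run sudo systemctl disable --now display-manager
    run sudo systemctl reset-failed
    run sudo zypper -n rm -u xdm
}

# Assumed usage: review the printed commands first, then rerun with DRY_RUN=0.
remove_display_manager
```

Defaulting to a dry run is a deliberate choice here, since removing packages with their unneeded dependencies is not trivially reversible.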
#2
Updated by nicksinger 4 months ago
I vote for a complete removal of all these weird X packages. Is there anything zypper can match against to tell it: "remove the graphic stack"?
#3
Updated by okurz 4 months ago
- Due date set to 2020-09-17
- Status changed from In Progress to Feedback
- Priority changed from Urgent to Normal
What I did now is fix the immediate problem and remove "xdm" and its dependencies, which should not be necessary, from all workers. The "graphic stack" should correspond to the pattern "x11", but this was not installed. I will monitor over the next days whether any openQA tests fail due to the removed services.