action #90170

Updated by okurz 6 months ago

## Observation

Observed today on `openqaworker3` while investigating the cause of the failed systemd services alert:

```
martchus@openqaworker3:/srv/salt> sudo systemctl status purge-kernels.service
purge-kernels.service - Purge old kernels
Loaded: loaded (/usr/lib/systemd/system/purge-kernels.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2021-03-15 06:51:54 CET; 1 day 4h ago
Main PID: 1166 (code=exited, status=7)

Mar 15 06:51:36 openqaworker3 systemd[1]: Starting Purge old kernels...
Mar 15 06:51:54 openqaworker3 zypper[1166]: System management is locked by the application with pid 1286 (zypper).
Mar 15 06:51:54 openqaworker3 zypper[1166]: Close this application before trying again.
Mar 15 06:51:54 openqaworker3 systemd[1]: purge-kernels.service: Main process exited, code=exited, status=7/NOTRUNNING
Mar 15 06:51:54 openqaworker3 systemd[1]: Failed to start Purge old kernels.
Mar 15 06:51:54 openqaworker3 systemd[1]: purge-kernels.service: Unit entered failed state.
Mar 15 06:51:54 openqaworker3 systemd[1]: purge-kernels.service: Failed with result 'exit-code'.
```

Simply restarting the service helps, of course. It is not clear how we could improve this further on our side. This is likely just a caveat of openSUSE's `purge-kernels-service` package, which provides that service (and which likely comes from https://github.com/openSUSE/mkinitrd).

## Acceptance criteria
* **AC1:** The systemd service `purge-kernels.service` does not fail if zypper is running for a short time

## Suggestion
Report this issue upstream as a bug and in the meantime apply a workaround on our side, e.g. a systemd service override with retry.
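
One possible shape for the retry part of such a workaround is a small shell wrapper that a service override could call instead of running zypper directly. This is only a sketch under assumptions: the function name `retry_on_lock`, the attempt count and the delay are hypothetical and not part of `purge-kernels-service`. The only detail taken from the log above is that the locked zypper run exited with status 7.

```sh
#!/bin/bash
# Hypothetical helper, not part of purge-kernels-service.
# Usage: retry_on_lock ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND while it exits with status 7 (the status seen in the
# log above when zypper was locked), for at most ATTEMPTS runs in total.
retry_on_lock() {
    local attempts="$1" delay="$2" n=1 rc
    shift 2
    while true; do
        if "$@"; then
            return 0
        else
            rc=$?
        fi
        # Any other failure, or running out of attempts, is passed through.
        if [ "$rc" -ne 7 ] || [ "$n" -ge "$attempts" ]; then
            return "$rc"
        fi
        echo "attempt $n/$attempts: locked, retrying in ${delay}s" >&2
        sleep "$delay"
        n=$((n + 1))
    done
}
```

A call could then look like `retry_on_lock 5 30 zypper -n purge-kernels`, with both numbers picked arbitrarily here. Retrying only on status 7 keeps real failures visible to the failed systemd services alert instead of hiding them behind repeated attempts.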