action #134177
closed[security][maintenance] test fails in sssd_389ds_functional during docker exec auto_review:"Test died.*docker exec ds389_container dscreate.*failed at":force_result:softfailed
Description
- https://openqa.suse.de/tests/11818371#step/sssd_389ds_functional/46
- https://openqa.suse.de/tests/11818371#step/sssd_389ds_functional/48
Starting installation ...
Validate installation settings ...
Create file system structures ...
Create self-signed certificate database ...
Error: Command '['systemctl', 'start', 'dirsrv@frist389']' returned non-zero exit status 1.
- Fails since (at least) Build 20230811-1
- Last good: 20230810-1 (or more recent)
- Latest result: latest
Updated by tjyrinki_suse over 1 year ago
- Subject changed from [security] test fails in sssd_389ds_functional during docker exec to [security][maintenance] test fails in sssd_389ds_functional during docker exec
- Status changed from New to Workable
- Priority changed from Normal to Urgent
- Start date deleted (2023-08-14)
Updated by tjyrinki_suse over 1 year ago
- Estimated time set to 8.00 h
I see this has already been soft-failed in the past, but because it blocks our visibility into the actual sssd test, getting it properly fixed is also a high priority.
Something is wrong with the docker container creation, and that should be investigated. At least a simple bump from version 15.4 to 15.5 did not help: https://openqa.suse.de/tests/11896611
The test seems to exercise only sssd on the "real" system under test, while 389ds is auxiliary and comes from a container (which is not necessarily the same version as the host, and does not necessarily have the same updates applied).
Updated by tjyrinki_suse over 1 year ago
- Subject changed from [security][maintenance] test fails in sssd_389ds_functional during docker exec to [security][maintenance] test fails in sssd_389ds_functional during docker exec auto_review:"Test died.*Hello from the server.*timed out at":force_result:softfailed
Feel free to work on it normally; I am only setting the subject line to check whether the auto_review feature works.
Updated by tjyrinki_suse over 1 year ago
- Subject changed from [security][maintenance] test fails in sssd_389ds_functional during docker exec auto_review:"Test died.*Hello from the server.*timed out at":force_result:softfailed to [security][maintenance] test fails in sssd_389ds_functional during docker exec auto_review:"Test died.*docker exec ds389_container dscreate.*failed at":force_result:softfailed
correction
Updated by FSzekely over 1 year ago
A little debugging of the ds389_container:
Command issued:
ldapserver:~ # systemctl start dirsrv@frist389
Job for dirsrv@frist389.service failed because the control process exited with error code.
See "systemctl status dirsrv@frist389.service" and "journalctl -xeu dirsrv@frist389.service" for details.
From the journal:
Sep 12 14:39:08 ldapserver systemd[1]: Starting 389 Directory Server frist389....
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.514419404 +0000] - INFO - main - 389-Directory/2.0.19 B2023.226.1200 starting up
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.521442733 +0000] - INFO - main - Setting the maximum file descriptor limit to: 1048576
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.632002781 +0000] - INFO - PBKDF2_SHA256 - Based on CPU performance, chose 3000 rounds
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.636543429 +0000] - NOTICE - bdb_start_autotune - found 2662908k physical memory
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.638732104 +0000] - NOTICE - bdb_start_autotune - found 1599524k available
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.640621690 +0000] - NOTICE - bdb_start_autotune - cache autosizing: db cache: 166431k
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.642381998 +0000] - NOTICE - bdb_start_autotune - total cache size: 136340889 B;
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.644217472 +0000] - ERR - bdb_version_write - Could not open file "/var/lib/dirsrv/slapd-frist389/db/DBVERSION" for writing Netscape Portable Runtime -5966 (Access Denied.)
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.645958188 +0000] - ERR - mkdir_p - /var/lib/dirsrv/slapd-frist389: error -5966 (Access Denied.)
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.648343872 +0000] - CRIT - bdb_start - Can't start because the database directory "/var/lib/dirsrv/slapd-frist389/db" either doesn't exist, or is not accessible
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.650035111 +0000] - ERR - ldbm_back_start - Failed to init database, err=-1 Unexpected dbimpl error code
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.651948432 +0000] - ERR - plugin_dependency_startall - Failed to start database plugin ldbm database
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.654131114 +0000] - CRIT - dblayer_setup - dblayer_init failed
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.655629826 +0000] - ERR - ldbm_back_start - Failed to setup dblayer
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.657162645 +0000] - ERR - plugin_dependency_startall - Failed to start database plugin ldbm database
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.658648068 +0000] - ERR - plugin_dependency_startall - Failed to resolve plugin dependencies
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.660284580 +0000] - ERR - plugin_dependency_startall - betxnpreoperation plugin 7-bit check is not started
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.661978439 +0000] - ERR - plugin_dependency_startall - preoperation plugin Account Usability Plugin is not started
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.663549955 +0000] - ERR - plugin_dependency_startall - accesscontrol plugin ACL Plugin is not started
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.665109855 +0000] - ERR - plugin_dependency_startall - preoperation plugin ACL preoperation is not started
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.666683311 +0000] - ERR - plugin_dependency_startall - betxnpreoperation plugin Auto Membership Plugin is not started
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.668329553 +0000] - ERR - plugin_dependency_startall - object plugin Class of Service is not started
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.669761246 +0000] - ERR - plugin_dependency_startall - preoperation plugin deref is not started
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.671315750 +0000] - ERR - plugin_dependency_startall - database plugin ldbm database is not started
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.672995199 +0000] - ERR - plugin_dependency_startall - betxnpreoperation plugin Linked Attributes is not started
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.674564583 +0000] - ERR - plugin_dependency_startall - betxnpreoperation plugin Managed Entries is not started
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.676118293 +0000] - ERR - plugin_dependency_startall - object plugin Multisupplier Replication Plugin is not started
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.677782794 +0000] - ERR - plugin_dependency_startall - object plugin Roles Plugin is not started
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.679365886 +0000] - ERR - plugin_dependency_startall - object plugin Views is not started
Sep 12 14:39:08 ldapserver ns-slapd[193]: [12/Sep/2023:14:39:08.680968624 +0000] - ERR - plugin_dependency_startall - extendedop plugin whoami is not started
Sep 12 14:39:08 ldapserver systemd[1]: dirsrv@frist389.service: Main process exited, code=exited, status=1/FAILURE
Sep 12 14:39:08 ldapserver systemd[1]: dirsrv@frist389.service: Failed with result 'exit-code'.
Sep 12 14:39:08 ldapserver systemd[1]: Failed to start 389 Directory Server frist389..
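Most of the ERR lines above are follow-on failures reported by plugin_dependency_startall; the entries that matter are the first few ERR/CRIT ones about /var/lib/dirsrv. A minimal sketch of a filter that narrows the journal down to those (the function name root_cause_lines is made up for illustration, it is not part of the test code):

```shell
#!/bin/sh
# Keep only ERR/CRIT entries from ns-slapd journal output read on stdin,
# and drop the plugin_dependency_startall lines, which only report
# failures that follow from the database plugin not starting.
root_cause_lines() {
    grep -E ' - (ERR|CRIT) - ' | grep -v 'plugin_dependency_startall'
}
```

Inside the container it could be used as: journalctl -u dirsrv@frist389 --no-pager | root_cause_lines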
After a bit of testing, the solution turned out to be an easy one:
ldapserver:~ # chown dirsrv:dirsrv /var/lib/dirsrv
ldapserver:~ # systemctl start dirsrv@frist389
Checking the processes:
ldapserver:~ # ps axuf | grep slapd
root 420 0.0 0.0 5272 764 pts/2 S+ 14:57 0:00 \_ grep slapd
dirsrv 390 0.9 9.9 1534100 263888 ? Ssl 14:57 0:00 /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-frist389 -i /run/dirsrv/slapd-frist389.pid
Journal:
Sep 12 14:57:01 ldapserver systemd[1]: Starting 389 Directory Server frist389....
Sep 12 14:57:01 ldapserver ns-slapd[390]: [12/Sep/2023:14:57:01.171806026 +0000] - INFO - main - 389-Directory/2.0.19 B2023.226.1200 starting up
Sep 12 14:57:01 ldapserver ns-slapd[390]: [12/Sep/2023:14:57:01.179269688 +0000] - INFO - main - Setting the maximum file descriptor limit to: 1048576
Sep 12 14:57:01 ldapserver ns-slapd[390]: [12/Sep/2023:14:57:01.291601399 +0000] - INFO - PBKDF2_SHA256 - Based on CPU performance, chose 3000 rounds
Sep 12 14:57:01 ldapserver ns-slapd[390]: [12/Sep/2023:14:57:01.295914712 +0000] - NOTICE - bdb_start_autotune - found 2662908k physical memory
Sep 12 14:57:01 ldapserver ns-slapd[390]: [12/Sep/2023:14:57:01.298406374 +0000] - NOTICE - bdb_start_autotune - found 1596096k available
Sep 12 14:57:01 ldapserver ns-slapd[390]: [12/Sep/2023:14:57:01.300780600 +0000] - NOTICE - bdb_start_autotune - cache autosizing: db cache: 166431k
Sep 12 14:57:01 ldapserver ns-slapd[390]: [12/Sep/2023:14:57:01.302778133 +0000] - NOTICE - bdb_start_autotune - total cache size: 136340889 B;
Sep 12 14:57:01 ldapserver ns-slapd[390]: [12/Sep/2023:14:57:01.304392868 +0000] - ERR - bdb_version_write - Could not open file "/var/lib/dirsrv/slapd-frist389/db/DBVERSION" for writing Netscape Portable Runtime -5950 (File not found.)
Sep 12 14:57:01 ldapserver ns-slapd[390]: [12/Sep/2023:14:57:01.644122446 +0000] - INFO - connection_table_new - conntablesize:64000
Sep 12 14:57:01 ldapserver ns-slapd[390]: [12/Sep/2023:14:57:01.657811333 +0000] - INFO - slapd_daemon - slapd started. Listening on All Interfaces port 389 for LDAP requests
Sep 12 14:57:01 ldapserver ns-slapd[390]: [12/Sep/2023:14:57:01.660519835 +0000] - INFO - slapd_daemon - Listening on /run/slapd-frist389.socket for LDAPI requests
Sep 12 14:57:01 ldapserver systemd[1]: Started 389 Directory Server frist389..
Fix OK.
PR is on the way.
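For reference, the manual fix above can be sketched as an idempotent helper that only re-owns the directory when needed. This is an illustration only, not the content of the PR; the function names has_owner and fix_and_start are hypothetical, and stat -c assumes GNU coreutils inside the container.

```shell
#!/bin/sh
# Return 0 if the path is owned by the expected "user:group", 1 otherwise.
# Uses GNU stat (-c); this is an assumption about the container image.
has_owner() {
    # $1 = path, $2 = expected owner as "user:group"
    [ "$(stat -c '%U:%G' "$1")" = "$2" ]
}

# Re-own /var/lib/dirsrv only if it is not already dirsrv:dirsrv,
# then start the instance. Path and instance name are the ones
# from this ticket.
fix_and_start() {
    has_owner /var/lib/dirsrv dirsrv:dirsrv || chown dirsrv:dirsrv /var/lib/dirsrv
    systemctl start dirsrv@frist389
}
```

Running fix_and_start as root inside the container would repeat the two commands shown above, skipping the chown when the ownership is already correct.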
Updated by FSzekely over 1 year ago
The container image creation happens here:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/tests/console/sssd_389ds_functional.pm#L59
The referenced Dockerfile has a line where zypper installs packages, among them is 389-ds:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/data/sssd/398-ds/Dockerfile_docker#L5
When the image is created by the root user, the ownership of /var/lib/dirsrv is wrong:
drwxr-x--- 1 root root 0 Aug 15 09:26 dirsrv
When installing 389-ds manually within the container the ownership is fine:
ldapserver:/ # zypper rm 389-ds
Reading installed packages...
Resolving package dependencies...
The following package is going to be REMOVED:
389-ds
1 package to remove.
After the operation, 14.6 MiB will be freed.
Continue? [y/n/v/...? shows all options] (y):
(1/1) Removing 389-ds-2.0.17~git81.849cc42-150400.3.31.1.x86_64 ........................................................................................................................................................................[done]
ldapserver:/ # zypper in 389-ds
Refreshing service 'container-suseconnect-zypp'.
Loading repository data...
Reading installed packages...
Resolving package dependencies...
The following NEW package is going to be installed:
389-ds
1 new package to install.
Overall download size: 3.1 MiB. Already cached: 0 B. After the operation, additional 14.6 MiB will be used.
Continue? [y/n/v/...? shows all options] (y):
Retrieving: 389-ds-2.0.17~git81.849cc42-150400.3.31.1.x86_64 (SLE-Module-Server-Applications15-SP4-Updates for sle-15-x86_64) (1/1), 3.1 MiB
Retrieving: 389-ds-2.0.17~git81.849cc42-150400.3.31.1.x86_64.rpm ...........................................................................................................................................................[done (2.9 MiB/s)]
Checking for file conflicts: ...........................................................................................................................................................................................................[done]
setting /usr/sbin/ns-slapd to root:dirsrv 0750 "cap_net_bind_service=ep". (wrong owner/group root:root permissions 0755, missing capabilities)
(1/1) Installing: 389-ds-2.0.17~git81.849cc42-150400.3.31.1.x86_64 .....................................................................................................................................................................[done]
ldapserver:/ # ls -alrt /var/lib/dirsrv
total 0
drwxr-x--- 1 dirsrv dirsrv 0 Aug 15 09:26 .
drwxr-xr-x 1 root root 158 Sep 13 09:05 ..
I would say it is acceptable to add a correction to the test case instead of raising a product bug.
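One possible shape for such a test-side correction, as a sketch only (the actual change in the PR may look different): fix the ownership inside the container before dscreate is run. The container name ds389_container is taken from the ticket subject; the RUNTIME variable and the function name fix_dirsrv_owner are assumptions made for this example.

```shell
#!/bin/sh
# Container runtime binary; overridable, e.g. RUNTIME=podman.
RUNTIME=${RUNTIME:-docker}

# Give /var/lib/dirsrv back to dirsrv:dirsrv inside the named container,
# so that ns-slapd can create its database directory on first start.
fix_dirsrv_owner() {
    # $1 = container name
    "$RUNTIME" exec "$1" chown dirsrv:dirsrv /var/lib/dirsrv
}
```

It would be called as fix_dirsrv_owner ds389_container before the docker exec ds389_container dscreate step.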
Updated by FSzekely over 1 year ago
- Status changed from Workable to Resolved
PR is done:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/17763
Merge needed.