action #122578

closed

[alert] OpenQA logreport for ariel.suse-dmz.opensuse.org, problems connecting to the database when database shuts down size:M

Added by okurz almost 2 years ago. Updated almost 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2023-01-02
Due date: 2023-01-20
% Done: 0%
Estimated time:

Description

Observation

From 2022-12-31:

[2022-12-31T16:00:05.500773Z] [error] [pid:1593] Unexpected error when updating job 3001504 executed by worker openqaworker4:1: DBIx::Class::Storage::DBI::_exec_txn_commit(): DBI Exception: DBD::Pg::db commit failed: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request. at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Needles.pm line 148

[2022-12-31T16:00:05.545694Z] [error] [pid:18320] Unexpected error when updating job 3001833 executed by worker openqaworker7:3: DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request. [for Statement "SELECT me.id, me.path, me.name FROM needle_dirs me WHERE ( me.path = ? )" with ParamValues: 1='/var/lib/openqa/share/tests/opensuse/products/opensuse/needles'] at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Needles.pm line 114

[2022-12-31T16:00:05.549336Z] [error] [pid:31119] Unexpected error when updating job 3001427 executed by worker openqaworker4:16: DBIx::Class::Storage::DBI::_exec_txn_commit(): DBI Exception: DBD::Pg::db commit failed: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request. at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Needles.pm line 148

[2022-12-31T16:00:05.549414Z] [error] [pid:10945] Unexpected error when updating job 3001372 executed by worker openqaworker1:6: Transaction aborted: DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request. [for Statement "SELECT me.id, me.path, me.name FROM needle_dirs me WHERE ( me.path = ? )" with ParamValues: 1='/var/lib/openqa/share/tests/opensuse/products/opensuse/needles'] at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Needles.pm line 114
 Rollback failed: DBIx::Class::Storage::DBI::_exec_txn_rollback(): DBI Exception: DBD::Pg::db rollback failed: no connection to the server at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/JobModules.pm line 159

[2022-12-31T16:00:05.556698Z] [error] [pid:5585] Unexpected error when updating job 3001595 executed by worker openqaworker19:16: Transaction aborted: DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request. [for Statement "SELECT me.id, me.dir_id, me.filename, me.last_seen_time, me.last_seen_module_id, me.last_matched_time, me.last_matched_module_id, me.last_updated, me.file_present, me.tags, me.t_created, me.t_updated FROM needles me WHERE ( ( me.dir_id = ? AND me.filename = ? ) )" with ParamValues: 1='9', 2='root-console-20180724.json'] at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Needles.pm line 123
 Rollback failed: DBIx::Class::Storage::DBI::_exec_txn_rollback(): DBI Exception: DBD::Pg::db rollback failed: no connection to the server at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/JobModules.pm line 159

[2022-12-31T16:00:05.557836Z] [error] [pid:21962] Unexpected error when updating job 3000999 executed by worker openqaworker4:20: Transaction aborted: DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request. [for Statement "SELECT me.id, me.dir_id, me.filename, me.last_seen_time, me.last_seen_module_id, me.last_matched_time, me.last_matched_module_id, me.last_updated, me.file_present, me.tags, me.t_created, me.t_updated FROM needles me WHERE ( ( me.dir_id = ? AND me.filename = ? ) )" with ParamValues: 1='9', 2='inst-overview-kde-20180807.json'] at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Needles.pm line 123
 Rollback failed: DBIx::Class::Storage::DBI::_exec_txn_rollback(): DBI Exception: DBD::Pg::db rollback failed: no connection to the server at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/JobModules.pm line 159

[2022-12-31T16:00:05.562536Z] [error] [pid:18715] Unexpected error when updating job 3001815 executed by worker openqaworker1:7: Transaction aborted: DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request. [for Statement "SELECT me.id, me.dir_id, me.filename, me.last_seen_time, me.last_seen_module_id, me.last_matched_time, me.last_matched_module_id, me.last_updated, me.file_present, me.tags, me.t_created, me.t_updated FROM needles me WHERE ( ( me.dir_id = ? AND me.filename = ? ) )" with ParamValues: 1='9', 2='gnome-terminal-LIVE-20210215.json'] at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Needles.pm line 123
 Rollback failed: DBIx::Class::Storage::DBI::_exec_txn_rollback(): DBI Exception: DBD::Pg::db rollback failed: no connection to the server at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/JobModules.pm line 159

[2022-12-31T16:00:05.565493Z] [error] [pid:2289] Unexpected error when updating job 3000826 executed by worker openqaworker7:4: Transaction aborted: DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request. [for Statement "SELECT me.id, me.dir_id, me.filename, me.last_seen_time, me.last_seen_module_id, me.last_matched_time, me.last_matched_module_id, me.last_updated, me.file_present, me.tags, me.t_created, me.t_updated FROM needles me WHERE ( ( me.dir_id = ? AND me.filename = ? ) )" with ParamValues: 1='9', 2='inst-packageinstallationstarted-simplified_UI-20220113.json'] at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Needles.pm line 123
 Rollback failed: DBIx::Class::Storage::DBI::_exec_txn_rollback(): DBI Exception: DBD::Pg::db rollback failed: no connection to the server at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/JobModules.pm line 155

[2022-12-31T16:00:05.567371Z] [error] [pid:31396] Unexpected error when updating job 3000490 executed by worker ip-10-252-32-98:1: DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request. [for Statement "UPDATE jobs SET skipped_module_count = skipped_module_count - 1, t_updated = ? WHERE id = ?" with ParamValues: 1='2022-12-31 16:00:05', 2='3000490'] at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Jobs.pm line 1236

[2022-12-31T16:00:05.567708Z] [error] [pid:29288] Unexpected error when updating job 3001223 executed by worker openqaworker4:15: Transaction aborted: DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request. [for Statement "SELECT me.id, me.dir_id, me.filename, me.last_seen_time, me.last_seen_module_id, me.last_matched_time, me.last_matched_module_id, me.last_updated, me.file_present, me.tags, me.t_created, me.t_updated FROM needles me WHERE ( ( me.dir_id = ? AND me.filename = ? ) )" with ParamValues: 1='9', 2='displaymanager-gdm-user-prompt-20201120.json'] at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Needles.pm line 123
 Rollback failed: DBIx::Class::Storage::DBI::_exec_txn_rollback(): DBI Exception: DBD::Pg::db rollback failed: no connection to the server at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/JobModules.pm line 159

[2022-12-31T16:00:05.600082Z] [error] [zaeBwZ9Z_q6u] DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL:  the database system is shutting down at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/Shared/Controller/Auth.pm line 145
…
[2022-12-31T16:00:05.620697Z] [error] [pid:1941] Unexpected error when updating job 3000919 executed by worker openqaworker7:6: DBIx::Class::Storage::DBI::_exec_txn_commit(): DBI Exception: DBD::Pg::db commit failed: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request. at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Needles.pm line 148
…
[2022-12-31T16:00:05.647238Z] [error] [pid:28013] Unexpected error when updating job 3001883 executed by worker openqaworker4:8: DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL:  the database system is shutting down at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Jobs.pm line 1460
…

[2022-12-31T16:00:05.669141Z] [error] [ZhAdj8_QHmOV] DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL:  the database system is shutting down at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/Shared/Controller/Auth.pm line 145

[2022-12-31T16:00:05.671175Z] [warn] [pid:11823] Unable to verify whether worker 514 runs its job(s) as expected: DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL:  the database system is shutting down at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/WebSockets/Controller/Worker.pm line 181

[2022-12-31T16:00:05.680413Z] [error] [RCvMdhcWEnLh] DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL:  the database system is shutting down at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/Shared/Controller/Auth.pm line 145

[2022-12-31T16:00:05.680868Z] [error] [pid:27335] Unexpected error when updating job 3001586 executed by worker openqaworker7:10: DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL:  the database system is shutting down at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Jobs.pm line 1223



[2022-12-31T16:00:05.688380Z] [error] [pid:2613] Unexpected error when updating job 3001831 executed by worker qa-power8-3:1: DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL:  the database system is shutting down at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/JobModules.pm line 183

[2022-12-31T16:00:05.689907Z] [error] [V-JN1psyxnSy] DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL:  the database system is shutting down at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/Shared/Controller/Auth.pm line 145
…

[2022-12-31T16:00:05.764380Z] [error] [QAD4DnnQvmrW] DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL:  the database system is shutting down at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/Schema/Result/Jobs.pm line 1253
…
[2022-12-31T16:00:05.875734Z] [error] [qsQUUg43-z3n] DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: no connection to the server [for Statement "SELECT me.id, me.result_dir, me.archived, me.state, me.priority, me.result, me.reason, me.clone_id, me.blocked_by_id, me.backend_info, me.TEST, me.DISTRI, me.VERSION, me.FLAVOR, me.ARCH, me.BUILD, me.MACHINE, me.group_id, me.assigned_worker_id, me.t_started, me.t_finished, me.logs_present, me.passed_module_count, me.failed_module_count, me.softfailed_module_count, me.skipped_module_count, me.externally_skipped_module_count, me.scheduled_product_id, me.result_size, me.t_created, me.t_updated, settings.id, settings.key, settings.value, settings.job_id, settings.t_created, settings.t_updated FROM jobs me LEFT JOIN job_settings settings ON settings.job_id = me.id WHERE ( me.id = ? )  ORDER BY me.id" with ParamValues: 1='3001829'] at /usr/share/openqa/script/../lib/OpenQA/WebAPI/Controller/API/V1/Job.pm line 43

[2022-12-31T16:00:06.014599Z] [error] [EGY0ZCyWByyc] DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request. at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/Shared/Controller/Auth.pm line 145

[2022-12-31T16:00:06.015322Z] [error] [_o0mlLhZa-Fj] DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request. at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/Shared/Controller/Auth.pm line 145

[2022-12-31T16:00:06.019578Z] [error] [iotRxnhwRJmT] DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request. at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/Shared/Controller/Auth.pm line 145

[2022-12-31T16:00:06.027060Z] [error] [YmUHBsS_dwlQ] DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request. at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/WebAPI/Controller/Test.pm line 569

[2022-12-31T16:00:06.080922Z] [error] [YrL-CwJSnIfB] DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: no connection to the server [for Statement "SELECT me.id, me.key, me.secret, me.user_id, me.t_expiration, me.t_created, me.t_updated FROM api_keys me WHERE ( me.key = ? )" with ParamValues: 1='XXX'] at /usr/share/openqa/script/../lib/OpenQA/Shared/Controller/Auth.pm line 145

[2022-12-31T16:00:06.112139Z] [error] [Zujueq11y7rK] DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: No such file or directory
    Is the server running locally and accepting connections on that socket? at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/Shared/Controller/Auth.pm line 145

[2022-12-31T16:00:06.226838Z] [error] [Bkw6wx2Dz7i_] DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: No such file or directory
    Is the server running locally and accepting connections on that socket? at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/Shared/Controller/Auth.pm line 145

[2022-12-31T16:00:06.231040Z] [error] [3s2hW8oTou9V] DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: No such file or directory
    Is the server running locally and accepting connections on that socket? at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/WebAPI/Controller/API/V1/Job.pm line 433

[2022-12-31T16:00:06.248453Z] [error] [XtcnY7RUnh67] DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: no connection to the server [for Statement "SELECT me.id, me.key, me.secret, me.user_id, me.t_expiration, me.t_created, me.t_updated FROM api_keys me WHERE ( me.key = ? )" with ParamValues: 1='XXX'] at /usr/share/openqa/script/../lib/OpenQA/Shared/Controller/Auth.pm line 145
…
[2022-12-31T16:00:11.772881Z] [error] [z8hqls9h70p-] DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: no connection to the server [for Statement "SELECT me.id, me.key, me.secret, me.user_id, me.t_expiration, me.t_created, me.t_updated FROM api_keys me WHERE ( me.key = ? )" with ParamValues: 1='XXX'] at /usr/share/openqa/script/../lib/OpenQA/Shared/Controller/Auth.pm line 145

Acceptance criteria

  • AC1: o3 works without error messages in the logs when the database server is shut down on purpose
  • AC2: There are still error messages if the database server is inaccessible for a longer period

Suggestions

  • Check the logs, including the PostgreSQL logs, to see whether the problem persists
  • Check the systemd service files for the database shutdown case to ensure that services relying on the database are stopped first, e.g. ensure that calling systemctl stop postgresql also stops openQA
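
As a starting point for checking the service files, the following sketch prints every dependency directive in the openQA unit files that names postgresql.service. The function name list_db_deps and the default unit directory are assumptions made for illustration; they do not come from the ticket.

```shell
#!/bin/sh
# Sketch: report how each openQA unit file refers to postgresql.service.
# unit_dir is an assumption; packaged units usually live in /usr/lib/systemd/system.
unit_dir="${1:-/usr/lib/systemd/system}"

list_db_deps() {
    # Print "file:directive" for every dependency or ordering line
    # that mentions postgresql.service in the openqa-*.service files of $1.
    grep -H -E '^(Requires|Wants|BindsTo|PartOf|After)=.*postgresql\.service' \
        "$1"/openqa-*.service 2>/dev/null
}

list_db_deps "$unit_dir" || echo "no openQA unit in $unit_dir names postgresql.service"
```

Only the unit files themselves are read here, so drop-ins and .requires/.wants directories would need a separate look.
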
Actions #1

Updated by okurz almost 2 years ago

  • Description updated (diff)
Actions #2

Updated by okurz almost 2 years ago

  • Tags deleted (infra)
  • Project changed from openQA Infrastructure to openQA Project
  • Subject changed from [alert] OpenQA logreport for ariel.suse-dmz.opensuse.org, something about authentication and other stuff to [alert] OpenQA logreport for ariel.suse-dmz.opensuse.org, problems connecting to the database when database shuts down
  • Description updated (diff)
  • Category set to Regressions/Crashes
  • Priority changed from High to Normal

The problem is no longer occurring. All services on o3 are running fine and https://openqa.opensuse.org also looks good. Moved to a software development task.

Actions #3

Updated by okurz almost 2 years ago

  • Subject changed from [alert] OpenQA logreport for ariel.suse-dmz.opensuse.org, problems connecting to the database when database shuts down to [alert] OpenQA logreport for ariel.suse-dmz.opensuse.org, problems connecting to the database when database shuts down size:M
  • Description updated (diff)
  • Status changed from New to Workable
Actions #4

Updated by jbaier_cz almost 2 years ago

  • Status changed from Workable to In Progress
  • Assignee set to jbaier_cz

It seems to me that the desired effect can be achieved by https://github.com/os-autoinst/openQA/pull/4971

Actions #5

Updated by openqa_review almost 2 years ago

  • Due date set to 2023-01-20

Setting due date based on mean cycle time of SUSE QE Tools

Actions #7

Updated by jbaier_cz almost 2 years ago

A few observations from a little experiment:

With postgresql inside Requires=

# systemctl stop postgresql
systemd[1]: Stopping Handler for live view in openQA's web UI...
systemd[1]: Stopping The openQA web UI...
systemd[1]: openqa-livehandler.service: Succeeded.
systemd[1]: Stopped Handler for live view in openQA's web UI.
systemd[1]: openqa-webui.service: Succeeded.
systemd[1]: Stopped The openQA web UI.
systemd[1]: Stopping PostgreSQL database server...
systemd[1]: postgresql.service: Succeeded.
systemd[1]: Stopped PostgreSQL database server.

The same situation with postgresql inside Wants=

# systemctl stop postgresql
systemd[1]: Stopping PostgreSQL database server...
systemd[1]: postgresql.service: Succeeded.
systemd[1]: Stopped PostgreSQL database server.
systemd[1]: Stopping Handler for live view in openQA's web UI...
systemd[1]: Stopping The openQA web UI...
systemd[1]: openqa-livehandler.service: Succeeded.
systemd[1]: Stopped Handler for live view in openQA's web UI.
systemd[1]: openqa-webui.service: Succeeded.
systemd[1]: Stopped The openQA web UI.

So indeed, the correct behavior requires the Requires= directive. However, I spotted two additional things:

  1. The package openQA-local-db includes openqa-setup-db.service, which already requires postgresql.service. We could thus require this unit (instead of the current Wants=) and provide an alternative package, openQA-remote-db, whose openqa-setup-db.service does not have the postgresql.service dependency.

  2. The unit openqa-gru.service fails without a DB connection, and its service file has Restart=on-failure. The restart will eventually also start postgresql.service again (openqa-gru wants openqa-setup-db, which requires postgresql). So PostgreSQL basically cannot be stopped without stopping openQA first.

    systemd[1]: openqa-gru.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
    systemd[1]: openqa-gru.service: Failed with result 'exit-code'.
    systemd[1]: openqa-gru.service: Scheduled restart job, restart counter is at 6.
    systemd[1]: Stopped The openQA daemon for various background tasks like cleanup and saving needles.
    systemd[1]: Starting PostgreSQL database server...
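
The Requires= variant from the experiment above could be tried without editing the packaged unit files by using a drop-in. The sketch below writes such a drop-in into a scratch directory. The file name 10-postgresql.conf, the helper write_dropin and the variable conf_root are hypothetical; on a real host the drop-in would go under /etc/systemd/system and need a systemctl daemon-reload.

```shell
#!/bin/sh
# Sketch: write a drop-in that ties a unit to postgresql.service.
# conf_root is a scratch directory (assumption); nothing on the live
# system is touched.
conf_root="${conf_root:-$(mktemp -d)}"

write_dropin() {
    # $1: unit name, e.g. openqa-webui.service
    dir="$conf_root/$1.d"
    mkdir -p "$dir"
    # Requires= propagates an explicit stop of postgresql to this unit;
    # After= orders start-up, and stop happens in the reverse order.
    cat > "$dir/10-postgresql.conf" <<'EOF'
[Unit]
Requires=postgresql.service
After=postgresql.service
EOF
    echo "$dir/10-postgresql.conf"
}

write_dropin openqa-webui.service
```
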
    
Actions #8

Updated by okurz almost 2 years ago

But if openQA requires openqa-setup-db then the corresponding package openQA would also require openQA-local-db, which is not wanted; otherwise both packages would be effectively the same. Maybe the original idea already goes too far. Have you actually checked what triggered the original problem on o3? In the meantime I suggest we revert the original PR https://github.com/os-autoinst/openQA/pull/4975

Actions #9

Updated by jbaier_cz almost 2 years ago

okurz wrote:

But if openQA requires openqa-setup-db then the corresponding package openQA would also require openQA-local-db, which is not wanted; otherwise both packages would be effectively the same.

My idea is to provide two versions of openqa-setup-db: one in openQA-local-db as it is now, and a second one inside a new package which will conflict with openQA-local-db and will not contain the postgresql dependency.

Maybe the original idea already goes too far. Have you actually checked what triggered the original problem on o3?

From the ticket subject I assumed that the original problem is known; my experiment also suggests it is likely the issue.

In the meantime I suggest we revert the original PR https://github.com/os-autoinst/openQA/pull/4975

Yes, I agree. We should revert to prevent ugly surprises during service restart after update.

Actions #10

Updated by jbaier_cz almost 2 years ago

jbaier_cz wrote:

Maybe the original idea already goes too far. Have you actually checked what triggered the original problem on o3?

From the ticket subject I assumed that the original problem is known; my experiment also suggests it is likely the issue.

Problem confirmed:

Dec 31 15:59:53 ariel openqa-continuous-update[21418]: (4/7) Installing: postgresql-server-15-150400.4.6.2.noarch [......
Dec 31 16:00:05 ariel systemd[1]: Stopping PostgreSQL database server...
Dec 31 16:00:05 ariel openqa-webui-daemon[2613]: FATAL:  terminating connection due to administrator command
Dec 31 16:00:06 ariel systemd[1]: postgresql.service: Deactivated successfully.
Dec 31 16:00:06 ariel systemd[1]: Stopped PostgreSQL database server.
Dec 31 16:00:06 ariel systemd[1]: Starting PostgreSQL database server...
Dec 31 16:00:06 ariel systemd[1]: Started PostgreSQL database server.
Dec 31 16:00:10 ariel systemd[1]: openqa-continuous-update.service: Deactivated successfully.
Dec 31 16:00:11 ariel systemd[1]: openqa-gru.service: Failed with result 'exit-code'.
Dec 31 16:00:11 ariel systemd[1]: openqa-gru.service: Scheduled restart job, restart counter is at 1.
Dec 31 16:00:11 ariel systemd[1]: Stopping Handler for live view in openQA's web UI...
Dec 31 16:00:11 ariel systemd[1]: Stopping The openQA web UI...
Dec 31 16:00:11 ariel systemd[1]: Stopped The openQA daemon for various background tasks like cleanup and saving needles.
Dec 31 16:00:11 ariel systemd[1]: Started The openQA daemon for various background tasks like cleanup and saving needles.
Dec 31 16:00:12 ariel systemd[1]: openqa-webui.service: Deactivated successfully.
Dec 31 16:00:12 ariel systemd[1]: Stopped The openQA web UI.
Dec 31 16:00:12 ariel systemd[1]: Started The openQA web UI.

The cause was an automatic update (openqa-continuous-update installing postgresql-server) and the subsequent restart of the PostgreSQL server.

Actions #11

Updated by jbaier_cz almost 2 years ago

Maybe there is a simple solution. All we need is to add Requires=postgresql.service if we have openQA-local-db installed, so we can just use the .requires override directory. I will try to prepare a PR for the spec file changes.
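
A minimal sketch of the .requires approach, assuming the symlinks are staged in a scratch directory that stands in for a build root. The helper add_db_requirement and the variable stage are hypothetical; the unit names and link targets match those that appear in the build log later in this ticket.

```shell
#!/bin/sh
# Sketch: stage ".requires" symlinks the way a package could ship them.
# stage is a scratch directory standing in for a build root (assumption);
# nothing is written to the live system.
stage="${stage:-$(mktemp -d)}"
unit_dir=/usr/lib/systemd/system

add_db_requirement() {
    # $1: unit that should gain an implicit Requires=postgresql.service
    mkdir -p "$stage$unit_dir/$1.requires"
    ln -sf "$unit_dir/postgresql.service" \
        "$stage$unit_dir/$1.requires/postgresql.service"
}

for unit in openqa-gru.service openqa-scheduler.service openqa-websockets.service; do
    add_db_requirement "$unit"
done

# Show what was staged.
find "$stage" -type l
```

A .requires directory only adds the requirement itself; ordering still depends on After= in the unit files.
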

Actions #12

Updated by livdywan almost 2 years ago

jbaier_cz wrote:

Maybe there is a simple solution. All we need is to add Requires=postgresql.service if we have openQA-local-db installed, so we can just use the .requires override directory. I will try to prepare a PR for the spec file changes.

Could Wants be used for this? As described here:

Description=Example service
Wants=postgresql.service
After=postgresql.service
Actions #13

Updated by jbaier_cz almost 2 years ago

cdywan wrote:

jbaier_cz wrote:

Maybe there is a simple solution. All we need is to add Requires=postgresql.service if we have openQA-local-db installed, so we can just use the .requires override directory. I will try to prepare a PR for the spec file changes.

Could Wants be used for this? As described here:

Description=Example service
Wants=postgresql.service
After=postgresql.service

No, as seen in the experiments #122653#note-7, the "Example service" will not be stopped if a wanted service fails/stops.

Actions #15

Updated by livdywan almost 2 years ago

  • Status changed from In Progress to Feedback
Actions #16

Updated by jbaier_cz almost 2 years ago

  • Status changed from Feedback to In Progress

See https://github.com/os-autoinst/openQA/pull/4976#issuecomment-1383736623

Apparently the build process for non-x86_64 architectures behaves differently: brp-25-symlink does not like symlinks whose target does not exist, and it does not skip relative symlinks on other architectures:

[   59s] calling /usr/lib/rpm/brp-suse.d/brp-25-symlink
[   59s] ERROR: link target doesn't exist (neither in build root nor in installed system):
[   59s]   /usr/lib/systemd/system/openqa-gru.service.requires/postgresql.service -> /usr/lib/systemd/system/postgresql.service
[   59s] Add the package providing the target to BuildRequires and Requires
[   59s] ERROR: link target doesn't exist (neither in build root nor in installed system):
[   59s]   /usr/lib/systemd/system/openqa-scheduler.service.requires/postgresql.service -> /usr/lib/systemd/system/postgresql.service
[   59s] Add the package providing the target to BuildRequires and Requires
[   59s] ERROR: link target doesn't exist (neither in build root nor in installed system):
[   59s]   /usr/lib/systemd/system/openqa-websockets.service.requires/postgresql.service -> /usr/lib/systemd/system/postgresql.service

I will try to edit the build requirements: https://github.com/os-autoinst/openQA/pull/4983
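
To spot such links before a build, dangling symlinks in a directory tree can be listed with find. This is only a rough approximation of the check: it resolves links on the local filesystem, whereas the error message indicates that the real check looks in both the build root and the installed system. The function name list_dangling and the demo files are made up for illustration.

```shell
#!/bin/sh
# Sketch: list symlinks whose target does not exist.
list_dangling() {
    # $1: directory to scan. "test -e" follows the link, so it fails
    # for a dangling one and the negation lets find print it.
    find "$1" -type l ! -exec test -e {} \; -print
}

# Demo in a scratch directory: one valid link, one dangling link.
demo=$(mktemp -d)
touch "$demo/real.service"
ln -s "$demo/real.service" "$demo/ok.service"
ln -s "$demo/missing.service" "$demo/broken.service"
list_dangling "$demo"
```
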

Actions #17

Updated by livdywan almost 2 years ago

  • Status changed from In Progress to Feedback

jbaier_cz wrote:

See https://github.com/os-autoinst/openQA/pull/4976#issuecomment-1383736623

Apparently the build process for non-x86_64 architectures behaves differently: brp-25-symlink does not like symlinks whose target does not exist, and it does not skip relative symlinks on other architectures:

[   59s] calling /usr/lib/rpm/brp-suse.d/brp-25-symlink
[   59s] ERROR: link target doesn't exist (neither in build root nor in installed system):
[   59s]   /usr/lib/systemd/system/openqa-gru.service.requires/postgresql.service -> /usr/lib/systemd/system/postgresql.service
[   59s] Add the package providing the target to BuildRequires and Requires
[   59s] ERROR: link target doesn't exist (neither in build root nor in installed system):
[   59s]   /usr/lib/systemd/system/openqa-scheduler.service.requires/postgresql.service -> /usr/lib/systemd/system/postgresql.service
[   59s] Add the package providing the target to BuildRequires and Requires
[   59s] ERROR: link target doesn't exist (neither in build root nor in installed system):
[   59s]   /usr/lib/systemd/system/openqa-websockets.service.requires/postgresql.service -> /usr/lib/systemd/system/postgresql.service

I will try to edit the build requirements: https://github.com/os-autoinst/openQA/pull/4983

Merged. Let's see if OBS is happy again

Actions #18

Updated by livdywan almost 2 years ago

  • Status changed from Feedback to Resolved

cdywan wrote:

Merged. Let's see if OBS is happy again

Seems to be fine
