action #72139

closed

openQA services on OSD failed to connect to database

Added by mkittler about 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Start date: 2020-09-30
Due date: 2020-10-20
% Done: 0%
Estimated time:
Tags:

Description

All openQA services that use the database showed connection errors. This is the first error logged by PostgreSQL:

2020-09-30 12:30:53.437 CEST openqa geekotest [7311]FATAL:  remaining connection slots are reserved for non-replication superuser connections

On the openQA side, the errors look like this:

Sep 30 12:47:45 openqa openqa[32459]: [error] [vJyMDc-a] DBIx::Class::Storage::DBI::catch {...} (): DBI Connection failed: DBI connect('dbname=openqa','geekotest',...) failed: FATAL:  remaining connection slots are reserved for non-replication superuser connections at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/Storage/DBI.pm line 1517. at /usr/share/openqa/script/../lib/OpenQA/Schema.pm line 172
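To get a rough idea of how often the error occurred, both log formats above can be searched for the same message. A minimal sketch, assuming the log lines are available as plain strings; the function name `count_slot_errors` is hypothetical and not part of openQA:

```python
import re

# Message text taken verbatim from the log lines quoted above.
SLOT_ERROR = re.compile(
    r"FATAL:\s+remaining connection slots are reserved for "
    r"non-replication superuser connections"
)


def count_slot_errors(lines):
    """Return how many of the given log lines contain the slot-exhaustion error."""
    return sum(1 for line in lines if SLOT_ERROR.search(line))
```

This works on lines from the PostgreSQL log as well as on the openQA journal output, since both contain the same FATAL message.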

This led to various alerts being triggered (Minion jobs alert, HTTP Response alert, Workers alert). Restarting the main openqa-webui service and the postgresql service fixed the error. (The restart of openqa-webui was likely unnecessary, considering the other services recovered without a restart.)

I also retried the failed Minion jobs and all of them passed. So there shouldn't be any active warnings anymore.

The question is what caused the connection limit to be exceeded. Theoretically we have a fixed number of services using a fixed number of connections.
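One way to approach this would be to look at who holds the connections while the problem is happening. PostgreSQL reports the error above once ordinary connections have used up `max_connections` minus `superuser_reserved_connections`. A minimal sketch, assuming rows fetched from the standard `pg_stat_activity` view; the helper names, the example numbers and the assumption that services set `application_name` are hypothetical, not taken from the OSD setup:

```python
from collections import Counter

# Query that could be run via psql or any DB driver.
# pg_stat_activity and these columns are standard PostgreSQL;
# the database name 'openqa' comes from the error message above.
ACTIVITY_QUERY = """
SELECT usename, application_name, state
FROM pg_stat_activity
WHERE datname = 'openqa'
"""


def connections_by_client(rows):
    """Count connections per (user, application_name) pair.

    `rows` is an iterable of (usename, application_name, state) tuples,
    i.e. the result of ACTIVITY_QUERY.
    """
    return Counter((user, app) for user, app, _state in rows)


def available_slots(max_connections, superuser_reserved, in_use):
    """Return how many slots are left for non-superuser connections."""
    return max(0, max_connections - superuser_reserved - in_use)
```

The actual limits can be read with `SHOW max_connections;` and `SHOW superuser_reserved_connections;`. If one client shows a growing count over time, that would point to a connection leak rather than to too many services.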


Related issues 1 (0 open, 1 closed)

Related to openQA Project (public) - action #72196: t/24-worker-jobs.t fails in OBS (Resolved, kraih, 2020-10-02)
