action #89731

closed

coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes

coordination #89842: [epic] Scalable and streamlined docker-compose based openQA setup

containers: The deploy using docker-compose is not stable and eventually fails

Added by ilausuch almost 4 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Regressions/Crashes
Target version:
Start date:
2021-03-09
Due date:
% Done:

0%

Estimated time:

Description

Motivation

The command 'docker-compose up' normally completes without errors, but sometimes one or more containers fail later, after docker-compose has already returned.

$ docker-compose up -d
Creating webui_db_1    ... done
Creating webui_nginx_1       ... done
Creating webui_data_1  ... done
Creating webui_scheduler_1   ... done
Creating webui_webui_1       ... done
Creating webui_webui_2       ... done
Creating webui_gru_1         ... done
Creating webui_websockets_1  ... done
Creating webui_livehandler_1 ... done
$ echo $?
0
$ docker-compose ps
       Name                      Command                State                                     Ports                                 
----------------------------------------------------------------------------------------------------------------------------------------
webui_data_1          /bin/sh -c /usr/bin/tail - ...   Up                                                                               
webui_db_1            docker-entrypoint.sh postgres    Up         5432/tcp                                                              
webui_gru_1           /root/run_openqa.sh              Up         443/tcp, 80/tcp, 9526/tcp, 9527/tcp, 9528/tcp, 9529/tcp               
webui_livehandler_1   /root/run_openqa.sh              Up         443/tcp, 80/tcp, 9526/tcp, 9527/tcp, 0.0.0.0:9528->9528/tcp, 9529/tcp 
webui_nginx_1         /entrypoint.sh                   Up         0.0.0.0:9526->9526/tcp                                                
webui_scheduler_1     /root/run_openqa.sh              Exit 255                                                                         
webui_websockets_1    /root/run_openqa.sh              Up         443/tcp, 80/tcp, 9526/tcp, 0.0.0.0:9527->9527/tcp, 9528/tcp, 9529/tcp 
webui_webui_1         /root/run_openqa.sh              Up         443/tcp, 80/tcp, 0.0.0.0:32789->9526/tcp, 9527/tcp, 9528/tcp, 9529/tcp
webui_webui_2         /root/run_openqa.sh              Up         443/tcp, 80/tcp, 0.0.0.0:32790->9526/tcp, 9527/tcp, 9528/tcp, 9529/tcp

The errors in the scheduler container are:

scheduler_1    | failed to run SQL in /usr/share/openqa/script/../dbicdh/PostgreSQL/deploy/90/001-auto-__VERSION.sql: DBIx::Class::DeploymentHandler::DeployMethod::SQL::Translator::try {...} (): DBI Exception: DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "pg_type_typname_nsp_index"
scheduler_1    | DETAIL:  Key (typname, typnamespace)=(dbix_class_deploymenthandler_versions_id_seq, 2200) already exists. at inline delegation in DBIx::Class::DeploymentHandler for deploy_method->deploy (attribute declared in /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/DeploymentHandler/WithApplicatorDumple.pm at line 51) line 18
scheduler_1    |  (running line 'CREATE TABLE dbix_class_deploymenthandler_versions ( id serial NOT NULL, version character varying(50) NOT NULL, ddl text, upgrade_sql text, PRIMARY KEY (id), CONSTRAINT dbix_class_deploymenthandler_versions_version UNIQUE (version) )') at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/DeploymentHandler/DeployMethod/SQL/Translator.pm line 263.
scheduler_1    | DBIx::Class::Storage::TxnScopeGuard::DESTROY(): A DBIx::Class::Storage::TxnScopeGuard went out of scope without explicit commit or error. Rolling back. at /usr/share/openqa/script/openqa-scheduler line 0
scheduler_1    | DBIx::Class::Storage::TxnScopeGuard::DESTROY(): A DBIx::Class::Storage::TxnScopeGuard went out of scope without explicit commit or error. Rolling back. at /usr/share/openqa/script/openqa-scheduler line 0

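The problem is that every container that uses the openqa_webui image (webui_webui, webui_websockets, webui_scheduler, webui_livehandler) tries to initialize the DB tables. Because all of these containers start at the same time, their initializations conflict: in the log above, the scheduler's CREATE TABLE fails because the dbix_class_deploymenthandler_versions_id_seq sequence already exists.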
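
One possible workaround, sketched here only to illustrate the race, is to start the services in stages so that a single container deploys the schema before the others come up. The service names db and scheduler are inferred from the container names in the output above. COMPOSE, INIT_WAIT, and staged_up are hypothetical names introduced for this example, and the fixed wait is a guess rather than a measured value. This is not necessarily the fix that was adopted for this ticket.

```shell
#!/bin/sh
# Staged startup sketch: let one container deploy the schema before the rest start.
# COMPOSE and INIT_WAIT are hypothetical knobs for this example.
COMPOSE="${COMPOSE:-docker-compose}"
INIT_WAIT="${INIT_WAIT:-30}"   # seconds; an assumed value, tune as needed

staged_up() {
    # 1. Start the database first so the initializer has something to connect to.
    $COMPOSE up -d db || return 1
    # 2. Start exactly one openqa_webui-based service so that only one
    #    process attempts to create the tables.
    $COMPOSE up -d scheduler || return 1
    # 3. Crude fixed wait for the schema deployment to finish.
    sleep "$INIT_WAIT"
    # 4. Start the remaining services. This assumes they skip the deployment
    #    once the schema is already in place.
    $COMPOSE up -d
}
```

Calling staged_up from the directory that holds the compose file replaces a plain 'docker-compose up -d'. A fixed sleep is fragile; polling the database for the deployed schema would be more robust, but how to do that depends on details not shown in this ticket.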

Acceptance Criteria

  • AC 1: All the containers remain up after executing docker-compose up
  • AC 2: Expand the docker-compose CI test to include this case
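
A check along the lines of AC 2 could parse the 'docker-compose ps' table and fail when any container is not in the Up state. The following is a minimal sketch that assumes the table layout shown in the description (a "Name" header row, a dashed rule, and a State column containing "Up"). The function name not_up is hypothetical.

```shell
#!/bin/sh
# Print the name of every container whose State is not "Up".
# Reads `docker-compose ps` output on stdin and returns non-zero
# if at least one such container is found.
not_up() {
    awk '
        $1 == "Name" || /^-+$/ { next }   # skip the header row and the dashed rule
        NF == 0 { next }                  # skip blank lines
        # Any row without a standalone "Up" word is reported by container name.
        $0 !~ / Up( |$)/ { print $1; bad = 1 }
        END { exit bad + 0 }
    '
}
```

In a CI job this could be run as 'docker-compose ps | not_up' some time after 'docker-compose up -d' has returned, since the failure described here only shows up later. With the table from the description it would print webui_scheduler_1 and return a non-zero status.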

Suggestions


Related issues 3 (0 open, 3 closed)

Related to openQA Project (public) - action #91046: CI: "webui-docker-compose" seems that eventually fails again (Resolved, ilausuch, 2021-04-13)

Blocked by openQA Project (public) - action #89719: docker-compose up fails on master (Resolved, ilausuch, 2021-03-09)

Blocked by openQA Project (public) - action #89722: Need automatic check for docker-compose (Resolved, ilausuch, 2021-03-09)
