openSUSE Project Management Tool https://progress.opensuse.org/ 2021-04-13T09:18:24Z
openQA Project - action #91046: CI: "webui-docker-compose" seems that eventually fails again https://progress.opensuse.org/issues/91046?journal_id=396944 2021-04-13T09:18:24Z ilausuch ilausuch@suse.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/396944/diff?detail_id=377201">diff</a>)</li></ul>
openQA Project - action #91046: CI: "webui-docker-compose" seems that eventually fails again https://progress.opensuse.org/issues/91046?journal_id=396950 2021-04-13T09:22:21Z ilausuch ilausuch@suse.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/396950/diff?detail_id=377207">diff</a>)</li></ul>
openQA Project - action #91046: CI: "webui-docker-compose" seems that eventually fails again https://progress.opensuse.org/issues/91046?journal_id=396956 2021-04-13T09:23:00Z ilausuch ilausuch@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-4 priority-default closed child" href="/issues/89731">action #89731</a>: containers: The deploy using docker-compose is not stable and eventually fails </i> added</li></ul>
openQA Project - action #91046: CI: "webui-docker-compose" seems that eventually fails again https://progress.opensuse.org/issues/91046?journal_id=396977 2021-04-13T10:06:40Z ilausuch ilausuch@suse.com
<ul></ul><p>Some discoveries</p>
<p>The healthcheck for the DB (<br>
<a href="https://github.com/os-autoinst/openQA/blob/abd9a2297430377cd9876c3cbcec8b2cb4302722/container/webui/docker-compose.yaml#L133" class="external">https://github.com/os-autoinst/openQA/blob/abd9a2297430377cd9876c3cbcec8b2cb4302722/container/webui/docker-compose.yaml#L133</a>) runs:</p>
<pre><code class="text syntaxhl" data-language="text">echo 'select * from api_keys;' | psql -U openqa openqa
</code></pre>
<p>This check is not valid because the command does not return a non-zero exit code when the query fails:</p>
<pre><code class="text syntaxhl" data-language="text">{"Status":"unhealthy","FailingStreak":0,"Log":[{"Start":"2021-04-13T12:01:03.499631783+02:00","End":"2021-04-13T12:01:03.698939023+02:00","ExitCode":0,"Output":"ERROR: relation \"api_keys\" does not exist\nLINE 1: select * from api_keys;\n ^\n"}]}
</code></pre>
<p>So in spite of this error, the healthcheck passes and docker-compose continues with the rest of the script.</p>
<p>However, this is an undetected deadlock, because this table cannot exist until webui_init starts, and webui_init cannot start until the DB has this table. So the proposed solution is:</p>
<ul>
<li>Change the healthcheck to something correct for the defined workflow</li>
<li>Check whether healthchecks are used on dependencies</li>
</ul>
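<p>A minimal sketch of both points, not the change that was actually made. It assumes the <code>db</code> service name and the <code>openqa</code> user and database from the command above, a hypothetical dependent service named <code>webui_db_init</code>, and a Compose file format that accepts <code>condition: service_healthy</code> under <code>depends_on</code>. The DB check only asks whether the server accepts connections (<code>pg_isready</code> exits non-zero when it does not), so it no longer depends on a table that is created later. The interval, timeout and retry values are placeholders.</p>
<pre><code class="yaml syntaxhl" data-language="yaml">services:
  db:
    healthcheck:
      # Assumption: server readiness is enough here; no table is queried.
      test: ["CMD-SHELL", "pg_isready -U openqa -d openqa"]
      interval: 5s   # placeholder value
      timeout: 5s    # placeholder value
      retries: 10    # placeholder value
  webui_db_init:     # hypothetical name for the service that creates the schema
    depends_on:
      db:
        # Start only after the db healthcheck above reports healthy.
        condition: service_healthy
</code></pre>
<p>If the check has to run a query instead, psql reports a failing statement through its exit status when the statement is passed with <code>-c</code>, or when <code>ON_ERROR_STOP</code> is set for input read from stdin.</p>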
openQA Project - action #91046: CI: "webui-docker-compose" seems that eventually fails again https://progress.opensuse.org/issues/91046?journal_id=396998 2021-04-13T11:44:50Z ilausuch ilausuch@suse.com
<ul></ul><p>More investigation</p>
<p>I discovered that psql does not exit with status 1 when an SQL command read from stdin fails (ON_ERROR_STOP is not set here):</p>
<pre><code class="text syntaxhl" data-language="text">root@cd6126e970f3:/# echo 'select * from api_keys2;' | psql -U openqa openqa
ERROR: relation "api_keys2" does not exist
LINE 1: select * from api_keys2;
^
root@cd6126e970f3:/# echo $?
0
</code></pre>
openQA Project - action #91046: CI: "webui-docker-compose" seems that eventually fails again https://progress.opensuse.org/issues/91046?journal_id=397001 2021-04-13T11:56:56Z ilausuch ilausuch@suse.com
<ul></ul><p>I created this PR <a href="https://github.com/os-autoinst/openQA/pull/3840" class="external">https://github.com/os-autoinst/openQA/pull/3840</a></p>
openQA Project - action #91046: CI: "webui-docker-compose" seems that eventually fails again https://progress.opensuse.org/issues/91046?journal_id=397004 2021-04-13T12:02:48Z ilausuch ilausuch@suse.com
<ul></ul><p>Although the sequence seems correct now, I am still getting the same DB errors:</p>
<pre><code class="text syntaxhl" data-language="text">db_1 | 2021-04-13 12:00:30.838 UTC [75] LOG: database system was shut down at 2021-04-13 12:00:30 UTC
db_1 | 2021-04-13 12:00:30.843 UTC [1] LOG: database system is ready to accept connections
db_1 | 2021-04-13 12:00:32.556 UTC [83] ERROR: relation "dbix_class_deploymenthandler_versions" does not exist at character 24
db_1 | 2021-04-13 12:00:32.556 UTC [83] STATEMENT: SELECT me.version FROM dbix_class_deploymenthandler_versions me ORDER BY id DESC LIMIT $1
db_1 | 2021-04-13 12:00:32.559 UTC [83] ERROR: relation "dbix_class_deploymenthandler_versions" does not exist at character 24
db_1 | 2021-04-13 12:00:32.559 UTC [83] STATEMENT: SELECT COUNT( * ) FROM dbix_class_deploymenthandler_versions me
db_1 | 2021-04-13 12:00:52.597 UTC [137] ERROR: relation "mojo_migrations" does not exist at character 21
db_1 | 2021-04-13 12:00:52.597 UTC [137] STATEMENT: SELECT version FROM mojo_migrations WHERE name = $1
</code></pre>
<p>I am not sure if this is directly related. However, it doesn't seem to affect the docker-compose workflow:</p>
<pre><code class="text syntaxhl" data-language="text">Name                    Command                          State          Ports
----------------------------------------------------------------------------------------------------------------------------------------------
webui_db_1              docker-entrypoint.sh postgres    Up (healthy)   5432/tcp
webui_gru_1             sh -c /root/run_openqa.sh| ...   Up (healthy)   443/tcp, 80/tcp, 9526/tcp, 9527/tcp, 9528/tcp, 9529/tcp
webui_livehandler_1     /root/run_openqa.sh              Up (healthy)   443/tcp, 80/tcp, 9526/tcp, 9527/tcp, 0.0.0.0:9528->9528/tcp, 9529/tcp
webui_nginx_1           /entrypoint.sh                   Up (healthy)   0.0.0.0:9526->9526/tcp
webui_scheduler_1       /root/run_openqa.sh              Up (healthy)   443/tcp, 80/tcp, 9526/tcp, 9527/tcp, 9528/tcp, 9529/tcp
webui_websockets_1      /root/run_openqa.sh              Up (healthy)   443/tcp, 80/tcp, 9526/tcp, 0.0.0.0:9527->9527/tcp, 9528/tcp, 9529/tcp
webui_webui_1           /root/run_openqa.sh              Up (healthy)   443/tcp, 80/tcp, 0.0.0.0:32793->9526/tcp, 9527/tcp, 9528/tcp, 9529/tcp
webui_webui_2           /root/run_openqa.sh              Up (healthy)   443/tcp, 80/tcp, 0.0.0.0:32792->9526/tcp, 9527/tcp, 9528/tcp, 9529/tcp
webui_webui_db_init_1   sh -c chmod -R a+rwX /data ...   Up (healthy)   443/tcp, 80/tcp, 0.0.0.0:32791->9526/tcp, 9527/tcp, 9528/tcp, 9529/tcp
</code></pre>
openQA Project - action #91046: CI: "webui-docker-compose" seems that eventually fails again https://progress.opensuse.org/issues/91046?journal_id=397058 2021-04-13T15:20:20Z ilausuch ilausuch@suse.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-4 status-3 priority-5 priority-high3 closed behind-schedule" href="/issues/90614">action #90614</a>: CI test webui-docker-compose failed but PR was merged anyway</i> added</li></ul>
openQA Project - action #91046: CI: "webui-docker-compose" seems that eventually fails again https://progress.opensuse.org/issues/91046?journal_id=397190 2021-04-14T07:20:22Z livdywan liv.dywan@suse.com
<ul></ul><p>ilausuch wrote:</p>
<blockquote>
<p>I created this PR <a href="https://github.com/os-autoinst/openQA/pull/3840" class="external">https://github.com/os-autoinst/openQA/pull/3840</a></p>
</blockquote>
<p>PR got merged</p>
<p><a class="user active user-mention" href="https://progress.opensuse.org/users/34361">@ilausuch</a> The ticket is still <em>New</em>, did you want to update that?</p>
openQA Project - action #91046: CI: "webui-docker-compose" seems that eventually fails again https://progress.opensuse.org/issues/91046?journal_id=397196 2021-04-14T07:25:14Z ilausuch ilausuch@suse.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Resolved</i></li></ul>