openSUSE Project Management Tool: Issues | https://progress.opensuse.org/ | 2021-07-22T10:17:59Z
Redmine QA - action #95854 (Rejected): Grafana doesn't show information during some minutes, but also we ... | https://progress.opensuse.org/issues/95854 | 2021-07-22T10:17:59Z | ilausuch (ilausuch@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>On 2021-07-22 at 10:14 we can see that Grafana is not showing any results for the Webui summary ping:<br>
<a href="https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?tab=alert&viewPanel=76&orgId=1&from=1626941518339&to=1626942588243" class="external">https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?tab=alert&viewPanel=76&orgId=1&from=1626941518339&to=1626942588243</a><br>
We can also see an increase in CPU usage after the recovery:<br>
<a href="https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?tab=alert&viewPanel=25&orgId=1" class="external">https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?tab=alert&viewPanel=25&orgId=1</a><br>
<a href="https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?tab=alert&viewPanel=25&orgId=1&from=1628060971570&to=1628663436435" class="external">https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?tab=alert&viewPanel=25&orgId=1&from=1628060971570&to=1628663436435</a></p>
<p>During the daily we checked that there was no large number of pending jobs that would explain the CPU usage after the recovery.</p>
openQA Project - action #95768 (New): containers: the single_container_test shows this error "Per... | https://progress.opensuse.org/issues/95768 | 2021-07-21T09:02:09Z | ilausuch (ilausuch@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Thanks to the previous issue <a class="issue tracker-4 status-3 priority-4 priority-default closed behind-schedule" title="action: [sporadic] containers: eventually the tests fails on single_container_webui step with the error "... (Resolved)" href="https://progress.opensuse.org/issues/95179">#95179</a> we could see in the logs that the webui has permission problems on the "factory" directory,<br>
e.g. <a href="https://openqa.opensuse.org/tests/1830297#step/single_container_webui/107" class="external">https://openqa.opensuse.org/tests/1830297#step/single_container_webui/107</a></p>
<p>The reason is that the directory is owned by the root user (in the container) but the webui runs as the geekotest user.</p>
<a name="Acceptance-Criteria"></a>
<h2 >Acceptance Criteria<a href="#Acceptance-Criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC 1</strong>: webui can create the needed directories in /var/lib/openqa/share/factory</li>
</ul>
<a name="Suggestion"></a>
<h2 >Suggestion<a href="#Suggestion" class="wiki-anchor">¶</a></h2>
<ul>
<li>Currently the logs are not printed if the container doesn't fail, so print them with "docker logs openqa_webui" before destroying the container</li>
<li>Check whether there are other directories that the webui needs to create, and whether it is able to create them</li>
</ul>
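<p>A minimal sketch of a check for AC 1, assuming it runs inside the container as the same user as the webui (geekotest). The helper name <code>can_create_subdir</code> is hypothetical and is not part of openQA; it only reports whether the current user may create entries in a directory.</p>
<pre><code class="python syntaxhl" data-language="python">import os


def can_create_subdir(path):
    """Return True if the current user may create entries inside path."""
    # Creating a subdirectory needs write and search permission on the parent.
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


# Example call (path taken from AC 1):
# can_create_subdir("/var/lib/openqa/share/factory")
</code></pre>
<p>If the check returns False for a directory owned by root, either the ownership or the mode of that directory would have to change before the webui can create its subdirectories.</p>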
openQA Project - action #94255 (New): containers: Improve the speed of the container test in CI | https://progress.opensuse.org/issues/94255 | 2021-06-18T10:48:35Z | ilausuch (ilausuch@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>The container testing has two main parts: build + run containers. <br>
The build process consumes a lot of time, and we recently saw several failures because of timeouts (<a class="issue tracker-4 status-3 priority-5 priority-high3 closed behind-schedule" title="action: openqa-in-OpenQA fails in openqa-from-containers (Resolved)" href="https://progress.opensuse.org/issues/93713">#93713</a>).<br>
The build spends much of that time installing packages with zypper (we don't have metrics yet; this is based on human observation).</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC 1</strong>: Decrease the time to run the container tests</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<p>Because we want to test openQA itself and not the packages we install in our system, I suggest focusing only on openQA testing and not caring too much about the required packages.<br>
For that, I suggest splitting each container image into two parts (two different Dockerfiles). One Dockerfile prepares the base system and installs all the packages. The other Dockerfile runs the openQA code for testing.<br>
The base image could be created during the test but, more interestingly, I think we could use a prebuilt image created by OBS (or another service) every day (or at whatever frequency we decide) and uploaded to some registry.</p>
openQA Project - action #94030 (Resolved): Cleanup logging in autoinst-log.txt for download assets | https://progress.opensuse.org/issues/94030 | 2021-06-15T12:51:18Z | ilausuch (ilausuch@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>In the ticket <a class="issue tracker-4 status-3 priority-4 priority-default closed child" title="action: Cleanup logging in autoinst-log.txt (Resolved)" href="https://progress.opensuse.org/issues/91527">#91527</a> we listed other cases to clean up. One of these is the asset download information, e.g.</p>
<pre><code>[2021-05-17T13:49:33.0579 CEST] [info] Download of Tumbleweed.x86_64-1.0-virtualbox-Snapshot20210516.vagrant.virtualbox.box processed:
[info] [#358]
Cache size of "/var/lib/openqa/cache" is 57GiB, with limit 180GiB
[info] [#358]
Downloading "Tumbleweed.x86_64-1.0-virtualbox-Snapshot20210516.vagrant.virtualbox.box" from "http://openqa1-opensuse/tests/1745884/asset/other/Tumbleweed.x86_64-1.0-virtualbox-Snapshot20210516.vagrant.virtualbox.box"
[info] [#358]
Content of "/var/lib/openqa/cache/openqa1-opensuse/Tumbleweed.x86_64-1.0-virtualbox-Snapshot20210516.vagrant.virtualbox.box" has not changed, updating last use
</code></pre>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC 1</strong>: indent with the same indentation criteria as in <a class="issue tracker-4 status-3 priority-4 priority-default closed child" title="action: Cleanup logging in autoinst-log.txt (Resolved)" href="https://progress.opensuse.org/issues/91527">#91527</a></li>
</ul>
<a name="Suggestion"></a>
<h2 >Suggestion<a href="#Suggestion" class="wiki-anchor">¶</a></h2>
<ul>
<li><a href="https://github.com/os-autoinst/os-autoinst/pull/1670" class="external">https://github.com/os-autoinst/os-autoinst/pull/1670</a></li>
<li>See: <a href="https://github.com/os-autoinst/os-autoinst/blob/09824a01348c41641d5cff8cd6d4192ad5ebd7a6/bmwqemu.pm#L199" class="external">https://github.com/os-autoinst/os-autoinst/blob/09824a01348c41641d5cff8cd6d4192ad5ebd7a6/bmwqemu.pm#L199</a></li>
</ul>
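<p>To illustrate the kind of cleanup meant here, the following sketch indents every continuation line of a multi-line message so that the whole message reads as one log entry. It is written in Python purely for illustration (os-autoinst itself is Perl), the function name is hypothetical, and the two-space prefix is an assumption; the actual criteria are the ones agreed in #91527.</p>
<pre><code class="python syntaxhl" data-language="python">def indent_continuation(message, prefix="  "):
    """Indent every line after the first one with prefix."""
    lines = message.splitlines()
    if not lines:
        return message
    # The first line stays next to the timestamp; the rest is indented.
    return "\n".join([lines[0]] + [prefix + line for line in lines[1:]])


# Example call:
# indent_continuation("Download processed:\n[info] [#358]")
</code></pre>
<p>Note that a trailing newline in the input is dropped by <code>splitlines()</code>, so the caller decides how the entry is terminated.</p>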
openQA Project - action #92893 (Resolved): containers, docker-compose: Ensure that the scheduler ... | https://progress.opensuse.org/issues/92893 | 2021-05-20T10:34:39Z | ilausuch (ilausuch@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>While trying to complete the task <a class="issue tracker-4 status-12 priority-4 priority-default child" title="action: How to run an openQA test in 5 minutes size:M (Workable)" href="https://progress.opensuse.org/issues/76978">#76978</a> I found a problem in the webui container: it cannot connect to the scheduler, so the jobs remain unscheduled.</p>
<pre><code class="text syntaxhl" data-language="text">scheduler_1 | -- Blocking request (http://127.0.0.1:9527/api/send_job)
scheduler_1 | -- Connect 41d600104970be636ece6296a21d1bdc (http://127.0.0.1:9527)
</code></pre>
<p>This can easily be fixed by adding OPENQA_WEB_SOCKETS_HOST: "websockets" to the scheduler declaration.</p>
<p>But then another problem occurs:</p>
<pre><code class="text syntaxhl" data-language="text">scheduler_1 | -- Client <<< Server (http://websockets:9527/api/send_job)
scheduler_1 | HTTP/1.1 403 Forbidden\x0d
scheduler_1 | Content-Length: 26\x0d
scheduler_1 | Server: Mojolicious (Perl)\x0d
scheduler_1 | Date: Thu, 20 May 2021 10:24:45 GMT\x0d
scheduler_1 | Content-Type: application/json;charset=UTF-8\x0d
scheduler_1 | \x0d
scheduler_1 | {"error":"Not authorized"}
</code></pre>
<p>This happens when we launch the openQA web UI using docker-compose and try to run a job (using clone_job).</p>
<a name="Acceptance-Criteria"></a>
<h2 >Acceptance Criteria<a href="#Acceptance-Criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC 1</strong>: scheduler can connect to websockets without problems</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Investigate the auth method for these services. Maybe only requests from localhost are authorized</li>
<li>Check that the client.ini has the correct credentials</li>
</ul>
<a name="References"></a>
<h2 >References<a href="#References" class="wiki-anchor">¶</a></h2>
<p>See the comments at <a class="issue tracker-4 status-3 priority-4 priority-default closed behind-schedule" title="action: containers: Web UI cannot connect to scheduler (Resolved)" href="https://progress.opensuse.org/issues/92833#note-6">#92833#note-6</a></p>
openQA Project - action #91752 (Resolved): jenkins: Multiple missing fields and errors in configu... | https://progress.opensuse.org/issues/91752 | 2021-04-26T11:10:31Z | ilausuch (ilausuch@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Saving a new configuration is necessary to accomplish <a class="issue tracker-4 status-3 priority-5 priority-high3 closed behind-schedule" title="action: openQA-in-openQA tests always fail and results do not impact submission pipeline (Resolved)" href="https://progress.opensuse.org/issues/88754">#88754</a>, but this is impossible because of multiple missing fields and errors in the UI configuration.<br>
It is not possible to save new changes.</p>
<a name="Acceptance-Criteria"></a>
<h2 >Acceptance Criteria<a href="#Acceptance-Criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC 1</strong>: The configuration can be changed and saved</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Configuration url: <a href="http://jenkins.qa.suse.de/view/openQA-in-openQA/job/monitor-openQA_in_openQA-TW/configure" class="external">http://jenkins.qa.suse.de/view/openQA-in-openQA/job/monitor-openQA_in_openQA-TW/configure</a></li>
</ul>
openQA Project - action #91488 (Resolved): containers: openqa test "single_container_webui" event... | https://progress.opensuse.org/issues/91488 | 2021-04-21T10:09:03Z | ilausuch (ilausuch@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Sometimes we see this error:</p>
<pre><code class="text syntaxhl" data-language="text">Error: No such container: openqa_webui
</code></pre>
<p>This happens because docker run is executed without problems, but a moment later the container no longer exists:</p>
<pre><code class="text syntaxhl" data-language="text">assert_script_run("docker run --rm -d --network testing $volumes $certificates -p 80:80 --name openqa_webui openqa_webui");
wait_for_container_log("openqa_webui", "Web application available at", "docker");
</code></pre>
<p>Examples:</p>
<ul>
<li><a href="https://openqa.opensuse.org/tests/1706491#step/single_container_webui/10" class="external">https://openqa.opensuse.org/tests/1706491#step/single_container_webui/10</a></li>
<li><a href="https://openqa.opensuse.org/tests/1706479#step/single_container_webui/10" class="external">https://openqa.opensuse.org/tests/1706479#step/single_container_webui/10</a></li>
</ul>
<p>We don't have enough information to know what the problem is.</p>
<a name="Acceptance-Criteria"></a>
<h2 >Acceptance Criteria<a href="#Acceptance-Criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC 1</strong>: Log information when this crashes</li>
<li><strong>AC 2</strong>: Fix the problem</li>
</ul>
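<p>One possible way to approach AC 1, sketched under assumptions: <code>docker run --rm</code> removes the container as soon as it exits, so a later <code>docker logs</code> can only answer "No such container". The sketch below starts the container without <code>--rm</code> and then reads its logs. It is written in Python for illustration only (the test itself uses the os-autoinst test API shown above), all function names are hypothetical, and it is not necessarily the fix that was applied.</p>
<pre><code class="python syntaxhl" data-language="python">import subprocess


def run_args(name, image, extra=()):
    """Build a docker run command that keeps the container after it exits."""
    # No --rm here: a removed container has no logs left to inspect.
    return ["docker", "run", "-d", *extra, "--name", name, image]


def logs_args(name):
    """Build the command that prints the logs of a container."""
    return ["docker", "logs", name]


def start_and_collect_logs(name, image, extra=(), run=subprocess.run):
    """Start the container, then return whatever it has logged so far."""
    run(run_args(name, image, extra), check=True)
    result = run(logs_args(name), capture_output=True, text=True)
    return result.stdout + result.stderr
</code></pre>
<p>The <code>run</code> parameter exists only so that the command execution can be replaced in a test; with this approach the container has to be removed explicitly (<code>docker rm</code>) once its logs have been saved.</p>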
openQA Project - action #91046 (Resolved): CI: "webui-docker-compose" seems that eventually fails... | https://progress.opensuse.org/issues/91046 | 2021-04-13T09:08:23Z | ilausuch (ilausuch@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>In <a class="issue tracker-4 status-3 priority-4 priority-default closed child" title="action: containers: The deploy using docker-compose is not stable and eventually fails (Resolved)" href="https://progress.opensuse.org/issues/89731">#89731</a> we introduced an initial webui container in charge of initializing the database. We have a test run where the health check failed:<br>
<a href="https://github.com/os-autoinst/openQA/pull/3838/checks?check_run_id=2329551052" class="external">https://github.com/os-autoinst/openQA/pull/3838/checks?check_run_id=2329551052</a></p>
<p>The problem is that docker-compose exits with an error because the health check of the webui_db_init container failed:</p>
<pre><code class="text syntaxhl" data-language="text"> Name Command State Ports
------------------------------------------------------------------------------------------------------------------------------------------------
webui_db_1 docker-entrypoint.sh postgres Up (healthy) 5432/tcp
webui_webui_db_init_1 sh -c chmod -R a+rwX /data ... Up (unhealthy) 443/tcp, 80/tcp, 0.0.0.0:49153->9526/tcp, 9527/tcp, 9528/tcp, 9529/tcp
make: *** [Makefile:306: test-containers-compose] Error 1
</code></pre>
<p>The health check is this one:<br>
<a href="https://github.com/os-autoinst/openQA/blob/abd9a2297430377cd9876c3cbcec8b2cb4302722/container/webui/docker-compose.yaml#L116" class="external">https://github.com/os-autoinst/openQA/blob/abd9a2297430377cd9876c3cbcec8b2cb4302722/container/webui/docker-compose.yaml#L116</a></p>
<p>Take into consideration the DB error lines:</p>
<pre><code class="text syntaxhl" data-language="text">db_1 | 2021-04-13 02:43:08.038 UTC [98] ERROR: relation "api_keys" does not exist at character 15
db_1 | 2021-04-13 02:43:08.038 UTC [98] STATEMENT: select * from api_keys;
db_1 | 2021-04-13 02:43:10.441 UTC [100] ERROR: relation "dbix_class_deploymenthandler_versions" does not exist at character 24
db_1 | 2021-04-13 02:43:10.441 UTC [100] STATEMENT: SELECT me.version FROM dbix_class_deploymenthandler_versions me ORDER BY id DESC LIMIT $1
db_1 | 2021-04-13 02:43:10.446 UTC [100] ERROR: relation "dbix_class_deploymenthandler_versions" does not exist at character 24
</code></pre>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC 1</strong>: Determine the cause of the failure</li>
<li><strong>AC 2</strong>: Fix the problem</li>
</ul>
openQA Project - action #90767 (Resolved): containers: Fix github test "webui-docker-compose" tim... | https://progress.opensuse.org/issues/90767 | 2021-04-07T10:21:10Z | ilausuch (ilausuch@suse.com)
<a name="Problem"></a>
<h2 >Problem<a href="#Problem" class="wiki-anchor">¶</a></h2>
<p>The test webui-docker-compose exceeds 30 minutes and is cancelled with the message <br>
"The job running on runner Hosted Agent has exceeded the maximum execution time of 30 minutes."</p>
<a name="Acceptance-Criteria"></a>
<h2 >Acceptance Criteria<a href="#Acceptance-Criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC 1</strong>: The test is executed without being cancelled</li>
</ul>
<a name="Resources"></a>
<h2 >Resources<a href="#Resources" class="wiki-anchor">¶</a></h2>
<p>Example of cancellation: <a href="https://github.com/os-autoinst/openQA/actions/runs/725105078" class="external">https://github.com/os-autoinst/openQA/actions/runs/725105078</a></p>
openQA Project - action #89752 (Resolved): containers: Add a worker service as part of the docker... | https://progress.opensuse.org/issues/89752 | 2021-03-09T16:05:55Z | ilausuch (ilausuch@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>To have a complete openQA deployment using docker-compose, at least one worker should be created.</p>
<a name="Acceptance-Criteria"></a>
<h2 >Acceptance Criteria<a href="#Acceptance-Criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC 1</strong>: Worker service is part of the compose workflow</li>
<li><strong>AC 2</strong>: The number of workers is configurable.</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>See: <a href="https://github.com/os-autoinst/openQA/pull/3755" class="external">https://github.com/os-autoinst/openQA/pull/3755</a></li>
<li>Add a section for the worker service to <code>container/webui/docker-compose.yaml</code></li>
<li>Add a variable like <code>$WORKER_COUNT</code></li>
<li>Add a corresponding healthcheck</li>
</ul>
openQA Project - action #89731 (Resolved): containers: The deploy using docker-compose is not sta... | https://progress.opensuse.org/issues/89731 | 2021-03-09T11:50:40Z | ilausuch (ilausuch@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>The command 'docker-compose up' is executed without errors under normal circumstances, but sometimes some of the containers fail later, after docker-compose has finished.</p>
<pre><code>$ docker-compose up -d
Creating webui_db_1 ... done
Creating webui_nginx_1 ... done
Creating webui_data_1 ... done
Creating webui_scheduler_1 ... done
Creating webui_webui_1 ... done
Creating webui_webui_2 ... done
Creating webui_gru_1 ... done
Creating webui_websockets_1 ... done
Creating webui_livehandler_1 ... done
$ echo $?
0
</code></pre><pre><code>docker-compose ps
Name Command State Ports
----------------------------------------------------------------------------------------------------------------------------------------
webui_data_1 /bin/sh -c /usr/bin/tail - ... Up
webui_db_1 docker-entrypoint.sh postgres Up 5432/tcp
webui_gru_1 /root/run_openqa.sh Up 443/tcp, 80/tcp, 9526/tcp, 9527/tcp, 9528/tcp, 9529/tcp
webui_livehandler_1 /root/run_openqa.sh Up 443/tcp, 80/tcp, 9526/tcp, 9527/tcp, 0.0.0.0:9528->9528/tcp, 9529/tcp
webui_nginx_1 /entrypoint.sh Up 0.0.0.0:9526->9526/tcp
webui_scheduler_1 /root/run_openqa.sh Exit 255
webui_websockets_1 /root/run_openqa.sh Up 443/tcp, 80/tcp, 9526/tcp, 0.0.0.0:9527->9527/tcp, 9528/tcp, 9529/tcp
webui_webui_1 /root/run_openqa.sh Up 443/tcp, 80/tcp, 0.0.0.0:32789->9526/tcp, 9527/tcp, 9528/tcp, 9529/tcp
webui_webui_2 /root/run_openqa.sh Up 443/tcp, 80/tcp, 0.0.0.0:32790->9526/tcp, 9527/tcp, 9528/tcp, 9529/tcp
</code></pre>
<p>The errors in the scheduler are:</p>
<pre><code>scheduler_1 | failed to run SQL in /usr/share/openqa/script/../dbicdh/PostgreSQL/deploy/90/001-auto-__VERSION.sql: DBIx::Class::DeploymentHandler::DeployMethod::SQL::Translator::try {...} (): DBI Exception: DBD::Pg::db do failed: ERROR: duplicate key value violates unique constraint "pg_type_typname_nsp_index"
scheduler_1 | DETAIL: Key (typname, typnamespace)=(dbix_class_deploymenthandler_versions_id_seq, 2200) already exists. at inline delegation in DBIx::Class::DeploymentHandler for deploy_method->deploy (attribute declared in /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/DeploymentHandler/WithApplicatorDumple.pm at line 51) line 18
scheduler_1 | (running line 'CREATE TABLE dbix_class_deploymenthandler_versions ( id serial NOT NULL, version character varying(50) NOT NULL, ddl text, upgrade_sql text, PRIMARY KEY (id), CONSTRAINT dbix_class_deploymenthandler_versions_version UNIQUE (version) )') at /usr/lib/perl5/vendor_perl/5.26.1/DBIx/Class/DeploymentHandler/DeployMethod/SQL/Translator.pm line 263.
scheduler_1 | DBIx::Class::Storage::TxnScopeGuard::DESTROY(): A DBIx::Class::Storage::TxnScopeGuard went out of scope without explicit commit or error. Rolling back. at /usr/share/openqa/script/openqa-scheduler line 0
scheduler_1 | DBIx::Class::Storage::TxnScopeGuard::DESTROY(): A DBIx::Class::Storage::TxnScopeGuard went out of scope without explicit commit or error. Rolling back. at /usr/share/openqa/script/openqa-scheduler line 0
</code></pre>
<p>The problem is that every container that uses the openqa_webui image (webui_webui, webui_websockets, webui_scheduler, webui_livehandler) tries to initialize the DB tables. As all the containers are initialized at the same time, conflicts arise.</p>
<a name="Acceptance-Criteria"></a>
<h2 >Acceptance Criteria<a href="#Acceptance-Criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC 1</strong>: All the containers remain up after executing docker-compose up
<del>* <strong>AC 2</strong>: Expand the docker-compose CI test to include this case</del></li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Use dependencies (depends_on) based on health-checks to sort the startup of all the containers.</li>
<li>Check current solution on <a href="https://github.com/os-autoinst/openQA/pull/3755">https://github.com/os-autoinst/openQA/pull/3755</a></li>
</ul>
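<p>The ordering idea behind the first suggestion can be sketched as follows: only one service initializes the database, and the others are started after that service reports healthy. In docker-compose this ordering is expressed with <code>depends_on</code> and health checks; the Python below only illustrates the logic, and every name in it (<code>wait_until</code>, <code>start_after_init</code>, the callbacks) is hypothetical.</p>
<pre><code class="python syntaxhl" data-language="python">import time


def wait_until(check, attempts=30, interval=1.0, sleep=time.sleep):
    """Call check() up to attempts times; return True once it succeeds."""
    for attempt in range(attempts):
        if check():
            return True
        # Do not sleep after the last failed attempt.
        if attempt != attempts - 1:
            sleep(interval)
    return False


def start_after_init(init_is_healthy, start_service, services, attempts=30, sleep=time.sleep):
    """Start the dependent services only after the DB init step is healthy."""
    if not wait_until(init_is_healthy, attempts=attempts, sleep=sleep):
        raise RuntimeError("database initialization did not become healthy")
    for name in services:
        start_service(name)
</code></pre>
<p>Because the dependent services never run while the schema is still being created, they cannot race each other on the CREATE TABLE statements shown in the errors above.</p>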
openQA Project - action #88482 (Resolved): Two absolute paths concatenated to form a default need... | https://progress.opensuse.org/issues/88482 | 2021-02-08T13:07:40Z | ilausuch (ilausuch@suse.com)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Sometimes a job execution fails with this error:</p>
<pre><code>needles_dir not found: /var/lib/openqa/share/tests/opensuse/var/lib/openqa/share/tests/opensuse/products/opensuse/needles (check vars.json?) at /usr/lib/os-autoinst/needle.pm line 330.
</code></pre>
<p>The code concatenates two absolute paths (<a href="https://github.com/os-autoinst/os-autoinst/blob/adbb28bc61ce4f21a55d07399eac7d48badc6b6f/needle.pm#L328" class="external">https://github.com/os-autoinst/os-autoinst/blob/adbb28bc61ce4f21a55d07399eac7d48badc6b6f/needle.pm#L328</a>) when the needles directory doesn't exist.</p>
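<p>A small sketch of how a doubled path like the one in the error message can arise, and of one possible guard. It is written in Python for illustration (needle.pm is Perl), the function names are hypothetical, and it does not reproduce the actual logic in needle.pm.</p>
<pre><code class="python syntaxhl" data-language="python">import posixpath


def naive_needles_dir(base, needles_dir):
    """Plain concatenation: wrong when needles_dir is already absolute."""
    return base + needles_dir


def resolve_needles_dir(base, needles_dir):
    """Prefix base only when needles_dir is a relative path."""
    if posixpath.isabs(needles_dir):
        return needles_dir
    return posixpath.join(base, needles_dir)


# Example call:
# resolve_needles_dir("/var/lib/openqa/share/tests/opensuse", "products/opensuse/needles")
</code></pre>
<p>With the base directory /var/lib/openqa/share/tests/opensuse and an already absolute needles path, the naive variant yields exactly the doubled path from the error message, while the guarded variant returns the absolute path unchanged.</p>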
<a name="Reproduction"></a>
<h2 >Reproduction<a href="#Reproduction" class="wiki-anchor">¶</a></h2>
<p>Remove the needles directory from /var/lib/openqa/share/tests/opensuse/products/opensuse</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li>AC1: Needles folder has a working default or aborts if PRODUCT_DIR/needles doesn't exist</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Look into logic in <code>needle.pm</code></li>
<li>Improve error message to reveal relevant variables instead of hinting at <code>vars.json</code> i.e. <code>NEEDLES_DIR</code>, <code>CASEDIR</code> and <code>default_needles_dir</code></li>
<li>Log the missing <code>${PRODUCTDIR}/needles</code> if this affects the default needles folder to be used here</li>
</ul>
<a name="Work-around"></a>
<h2 >Work-around<a href="#Work-around" class="wiki-anchor">¶</a></h2>
<ul>
<li>Create a folder <code>${PRODUCTDIR}/needles</code></li>
</ul>
openQA Project - action #88187 (Resolved): Set the addresses in the "internal clients" configurable | https://progress.opensuse.org/issues/88187 | 2021-01-25T10:46:37Z | ilausuch (ilausuch@suse.com)
<p>Problem:<br>
The listening addresses are hardcoded to localhost within the different "internal clients" (e.g. lib/OpenQA/Scheduler/Client.pm and lib/OpenQA/WebSockets/Client.pm).<br>
This limitation prevents running the different parts of the web UI on different hosts, e.g. in a load-balanced environment, because the different components (scheduler, websockets, …) cannot communicate with each other.</p>
<p>Suggested solution:<br>
Read an environment variable like OPENQA_SCHEDULER_HOST. This environment variable then needs to be supplied to all other containers.<br>
Note: It looks like the livehandler and gru don't have a client. That likely means it is not necessary to care about them, as no other services access them (via HTTP).</p>
<p>AC1: The addresses in the "internal clients" are configurable</p>
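<p>The suggested solution can be sketched as a lookup with a localhost fallback. This is a Python illustration only (the internal clients are Perl modules); the function name <code>service_url</code> is hypothetical, the variable name is passed in by the caller, and the port used in the example is an assumption of this sketch.</p>
<pre><code class="python syntaxhl" data-language="python">import os


def service_url(env_var, port, default_host="127.0.0.1", environ=None):
    """Build the base URL of an internal service from an environment variable."""
    env = os.environ if environ is None else environ
    # An unset or empty variable keeps the previous localhost behaviour.
    host = env.get(env_var) or default_host
    return "http://{}:{}".format(host, port)


# Example call:
# service_url("OPENQA_WEB_SOCKETS_HOST", 9527)
</code></pre>
<p>Keeping 127.0.0.1 as the default means that single-host installations behave as before, and only containerized or load-balanced setups need to set the variable.</p>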
openQA Project - coordination #81060 (Blocked): [epic] openQA web UI in kubernetes | https://progress.opensuse.org/issues/81060 | 2020-12-15T06:09:11Z | ilausuch (ilausuch@suse.com)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>Helm charts allow you to automate the deployment of a complex set of items in a kubernetes environment. These elements are not limited to pods (containers) but also include configurations (configmaps and secrets) and all the resources they need, in the correct order and with the proper checks.</p>
<p>Thanks to the work done in <a class="issue tracker-6 status-15 priority-4 priority-default parent" title="coordination: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containe... (Blocked)" href="https://progress.opensuse.org/issues/80142">#80142</a> we saw how to divide the web UI into parts, which of them could be converted to HA and which had to remain standalone, as well as how we should configure the load balancer to integrate each of the different services that make up the complete web UI.</p>
<p>This ticket proposes to create a helm chart capable of generating a complete and functional deployment of the web UI based on the following prerequisites:</p>
<ul>
<li>There is a pre-existing installation of kubernetes</li>
</ul>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> The complete web UI HA is installed with the DB with the default options</li>
<li><strong>AC2:</strong> The web UI is accessible from outside of the cluster</li>
<li><strong>AC3:</strong> The helm chart is configurable with typical basic parameters, such as the number of replicas for HA, the type of persistence for the DB, ...</li>
<li><strong>AC4:</strong> Documentation is completed with usage instructions</li>
<li><strong>AC5:</strong> Deployed together with rancher</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Highly recommended based on work already done in <a class="issue tracker-6 status-15 priority-4 priority-default parent" title="coordination: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containe... (Blocked)" href="https://progress.opensuse.org/issues/80142">#80142</a>, e.g. the existing docker-compose setup</li>
<li><del>Proof-of-concept of either openQA webUI <em>or</em> worker within kubernetes, e.g. using k3s or try rancher directly</del> done for both webUI and worker in <a class="issue tracker-4 status-3 priority-4 priority-default closed child" title="action: [timeboxed:20h][spike] openQA proof-of-concept within kubernetes size:M (Resolved)" href="https://progress.opensuse.org/issues/110524">#110524</a></li>
<li>Use local kubernetes deployments for development purposes (this avoids the need for infrastructure), for instance minikube, k3s, ...</li>
<li>Figure out whether it is necessary to publish the helm chart, and where: <a href="https://helm.sh/docs/howto/chart_releaser_action/">https://helm.sh/docs/howto/chart_releaser_action/</a></li>
<li>Combine with rancher</li>
<li>Ensure proper testing of the charts
<ul>
<li>Create kubernetes cluster inside docker inside GitHub Actions: <a href="https://github.com/marketplace/actions/kind-cluster">https://github.com/marketplace/actions/kind-cluster</a></li>
<li>Look at chart-testing tool: <a href="https://github.com/marketplace/actions/helm-chart-testing">https://github.com/marketplace/actions/helm-chart-testing</a></li>
<li>Try to deploy the chart inside CI</li>
</ul></li>
<li>Add definitions for init containers to allow fetching tests/needles from git repository during installation</li>
<li>As an alternative to git, provide a persistent volume claim template for shared volume (ReadMany) -- think about Longhorn</li>
<li>Add definition for rsyncd container to allow usage of cache service in the worker pod to synchronize data between webui and worker pods</li>
<li>Enhance customization options of the current chart, add common options (like annotations, pod security, replicas count, ...) which are provided by the blank helm templates (reuse initial templates created by <code>helm create</code>)</li>
</ul>
openQA Project - action #65954 (Rejected): Create a way to check which jobs contain a test | https://progress.opensuse.org/issues/65954 | 2020-04-22T06:03:01Z | ilausuch (ilausuch@suse.com)
<p>Using this URL <a href="https://openqa.suse.de/admin/test_suites" class="external">https://openqa.suse.de/admin/test_suites</a> we can find a test.<br>
Then, on the "All tests" page <a href="https://openqa.suse.de/tests" class="external">https://openqa.suse.de/tests</a>, we can try to find this test in the list of last finished jobs.<br>
But sometimes it doesn't appear because this list is limited to 500 jobs.<br>
It would be interesting to be able to find out which jobs contain this test.</p>