action #43715

closed

coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes

coordination #43706: [epic] Generate "download&use" docker image of openQA for SUSE QA

Update upstream dockerfiles to provide an easy to use docker image of workers

Added by SLindoMansilla almost 6 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Low
Assignee:
Category: Feature requests
Target version:
Start date: 2018-11-13
Due date:
% Done: 0%
Estimated time:

Description

Acceptance criteria

  • AC1: DONE Ensure there is a working Dockerfile that builds an image
  • AC2: DONE Documented steps to run the worker from the image
  • AC3: DONE The worker can connect to a webui and run tests

Suggestions


Related issues 3 (0 open, 3 closed)

  • Related to openQA Project - action #69355: [spike] redundant/load-balancing webui deployments of openQA (Resolved, ilausuch, 2020-07-25)
  • Related to openQA Project - action #73450: POC: Create openQA worker container image (feature) (Resolved, ilausuch, 2020-10-16)
  • Blocked by openQA Project - action #43712: Update upstream dockerfiles to provide an easy to use docker image of openQA-webui (Resolved, ilausuch, 2018-11-13)

Actions #1

Updated by okurz almost 6 years ago

  • Target version set to Milestone 22
Actions #2

Updated by okurz over 5 years ago

  • Description updated (diff)
Actions #3

Updated by szarate over 5 years ago

  • Related to action #43718: Docker image for webui and workers are versioned and uploaded to obs registry added
Actions #4

Updated by szarate over 5 years ago

  • Related to deleted (action #43718: Docker image for webui and workers are versioned and uploaded to obs registry)
Actions #5

Updated by okurz over 5 years ago

  • Target version changed from Milestone 22 to Milestone 24
Actions #6

Updated by okurz over 5 years ago

  • Subject changed from [functional][u] Update upstream dockerfiles to provide an easy to use docker image of workers to Update upstream dockerfiles to provide an easy to use docker image of workers
  • Category set to Feature requests
  • Priority changed from Normal to Low
  • Target version deleted (Milestone 24)
Actions #7

Updated by ilausuch about 4 years ago

  • Assignee set to ilausuch
Actions #8

Updated by okurz about 4 years ago

  • Target version set to Ready
Actions #9

Updated by livdywan about 4 years ago

  • Blocked by action #43712: Update upstream dockerfiles to provide an easy to use docker image of openQA-webui added
Actions #10

Updated by livdywan about 4 years ago

Let's consider this blocked in the sense that the required steps are the same, only with a focus on the worker instead of the web UI.

Actions #11

Updated by livdywan almost 4 years ago

  • Status changed from Workable to Blocked
Actions #12

Updated by ilausuch almost 4 years ago

Before working on this ticket I would like to complete https://progress.opensuse.org/issues/69355, because it is related and this ticket may depend on how that one is resolved.

Actions #13

Updated by livdywan almost 4 years ago

  • Related to action #69355: [spike] redundant/load-balancing webui deployments of openQA added
Actions #14

Updated by ilausuch almost 4 years ago

  • Status changed from Blocked to In Progress
Actions #15

Updated by ilausuch almost 4 years ago

One thing we should consider is the --link parameter of docker run for the workers. The Docker documentation carries a warning about it: https://docs.docker.com/network/links/
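As a rough sketch of the alternative the Docker documentation points to, containers can be attached to a user-defined network instead of being linked. The helper name start_on_network and the network, container, and image names in the usage comment are hypothetical and not taken from this ticket.

```shell
# Sketch: connect containers through a user-defined network instead of --link.
# DOCKER can be overridden, e.g. with podman; all names passed in are placeholders.
DOCKER="${DOCKER:-docker}"

start_on_network() {
    network="$1"
    name="$2"
    image="$3"
    # Containers on the same user-defined network can resolve each other by name.
    "$DOCKER" run -d --name "$name" --network "$network" "$image"
}

# Usage (hypothetical names):
#   docker network create openqa-net
#   start_on_network openqa-net webui  "$WEBUI_IMAGE"
#   start_on_network openqa-net worker "$WORKER_IMAGE"
```

With this layout the worker would reach the web UI by its container name, so no --link is needed.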

Actions #16

Updated by ilausuch almost 4 years ago

Two initial problems to fix during the build

  • Package 'qemu-uefi-aarch64' not found
  • /root/qemu/kvm-mknod.sh: line 6: gunzip: command not found
Actions #18

Updated by ilausuch almost 4 years ago

New problems to solve

  • gzip: /proc/config.gz: No such file or directory
  • mknod: /dev/kvm: File exists
  • Unable to make /dev/kvm node; software emulation will be used (This can happen if the container is run without -privileged)

I think the last one is caused by the configuration.
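One possible way to avoid the "File exists" error is to create the node only when it is missing. This is a minimal sketch under that assumption; ensure_kvm_node is a hypothetical helper and not the upstream kvm-mknod.sh.

```shell
# Sketch: create the KVM device node only if it does not exist yet.
ensure_kvm_node() {
    dev="${1:-/dev/kvm}"
    if [ -e "$dev" ]; then
        # Node already present, for example passed in from the host.
        echo "$dev already exists, skipping mknod"
        return 0
    fi
    # 10,232 is the character device number Linux uses for KVM.
    if mknod "$dev" c 10 232 2>/dev/null; then
        echo "created $dev"
    else
        echo "could not create $dev; software emulation will be used" >&2
        return 1
    fi
}
```

Calling ensure_kvm_node without arguments checks /dev/kvm; a path can be passed for testing.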

Actions #19

Updated by okurz almost 4 years ago

  • Related to action #73450: POC: Create openQA worker container image (feature) added
Actions #20

Updated by livdywan almost 4 years ago

  • Description updated (diff)
Actions #21

Updated by okurz almost 4 years ago

  • Due date set to 2020-10-28

As I think you are actually working on this, we should aim to not exceed our usual cycle times for tickets, hence I am setting the due date to what I consider feasible and useful. Please make sure to provide an update soon, and feel free to unassign again if you are no longer working on this.

Actions #22

Updated by ilausuch almost 4 years ago

Doing some tests on the worker I realized that the worker cannot start because:

[info] [pid:44] Project dir for host http://webui_haproxy_1 is /var/lib/openqa/share
[info] [pid:44] Registering with openQA http://webui_haproxy_1
[info] [pid:44] Establishing ws connection via ws://webui_haproxy_1/api/v1/ws/1
[warn] [pid:44] Unable to upgrade to ws connection via http://webui_haproxy_1/api/v1/ws/1 - trying again in 10 seconds

Facts:

  • The worker can connect to the web UI API and authenticate the user
  • The worker cannot connect to the websockets

To solve that, Christian and I worked on replacing haproxy with nginx, which allows reverse proxying for the websockets. This seems to work:

server {
  listen       80;
  listen       9526;
  server_name  localhost;

  location ~ /api/v1/ws/(.*) {
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $http_host;
    proxy_set_header X-NginX-Proxy true;

    rewrite ^//api/v1/ws/(.*)$ http://webui_websockets_1:9527/ws/$1;
    proxy_pass http://webui_websockets_1:9527;
    proxy_redirect off;
  }

  location / {
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $http_host;
    proxy_set_header X-NginX-Proxy true;

    proxy_pass http://webui_haproxy_1:9526;
    proxy_redirect off;
  }
}

Facts:

  • nginx passes the query on to the websockets service
  • nginx always returns Not Found in plain text (tested from the worker and also in Christian's environment)
  • Christian checked that on O3 the websockets service also returns Not Found for the same path /api/v1/ws/1. However, on O3 the worker starts and in this test it does not. What is the reason for that?

Ideas:

  • Check with a fake websockets service that nginx is sending the correct path
  • Check what the worker expects from the websockets service

Closely related to https://progress.opensuse.org/issues/69355. Both tickets are interdependent.

Actions #23

Updated by ilausuch almost 4 years ago

  • Status changed from In Progress to Feedback
Actions #24

Updated by mkittler almost 4 years ago

I see that your NGINX config differs from what we have in our repository: https://github.com/os-autoinst/openQA/blob/master/etc/nginx/vhosts.d/openqa.conf

Maybe the lack of proxy_set_header Upgrade leads to the 404 response.

As mentioned in the chat yesterday: It is also possible to specify the web UI port directly within the worker config. The worker will then use this port +2 for the web sockets route making it unnecessary to proxy the web socket connection. Having NGINX proxying the web socket connection would be the nicer solution of course.
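For illustration, a minimal sketch of a websocket location that forwards the upgrade handshake. The upstream name webui_websockets_1 and port 9527 come from the config in note #22, and the /ws/ target path mirrors the intent of the rewrite there. Everything else is standard NGINX proxying and an assumption, not a copy of the openqa.conf in the repository.

```nginx
location /api/v1/ws/ {
    # The URI part of proxy_pass replaces the matched location prefix,
    # so /api/v1/ws/1 is forwarded as /ws/1 (assumed target path).
    proxy_pass http://webui_websockets_1:9527/ws/;

    # A websocket upgrade needs HTTP/1.1 and the hop-by-hop headers passed on explicitly.
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $http_host;
}
```

Without the Upgrade and Connection headers the backend only sees a plain HTTP request for the websocket path, which would fit the Not Found response described above.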

Actions #25

Updated by ilausuch almost 4 years ago

It works. I am finishing https://progress.opensuse.org/issues/69355 first.

Actions #26

Updated by ilausuch almost 4 years ago

  • Status changed from Feedback to In Progress
Actions #27

Updated by ilausuch almost 4 years ago

Thanks Marius, this worked and now we have a new PR for the web UI that solves all these problems. Now I am preparing this PR to use api keys and hosts for use with https://progress.opensuse.org/issues/69355

Actions #28

Updated by ilausuch almost 4 years ago

Additionally, following the subject of this task ("provide an easy to use..."), I created a script to launch a pool of workers:
https://github.com/os-autoinst/openQA/pull/3495

Actions #29

Updated by livdywan almost 4 years ago

  • Description updated (diff)
Actions #30

Updated by ilausuch almost 4 years ago

  • Description updated (diff)

The acceptance criteria AC1, AC2 and AC3 are covered by https://github.com/os-autoinst/openQA/pull/3475 and https://github.com/os-autoinst/openQA/pull/3495
My next step is to build the docker image

Actions #32

Updated by livdywan almost 4 years ago

  • Due date changed from 2020-10-28 to 2020-11-13

For the record, these questions were still being discussed today:

  • We might need different containers for specific architectures with their own tags
  • What base image to use
  • Fetching repos via http://download.opensuse.org

Setting up working builds in a home project was far from straightforward even with OBS expertise on hand, so I recommend documenting this in a blog post or wiki page afterwards, although that is not required to finish the ticket.

Actions #33

Updated by pdostal almost 4 years ago

As far as I know, no separate tags are required for different architectures. There can be multiple builds of the same Dockerfile with the same tag for different architectures.

Actions #35

Updated by livdywan almost 4 years ago

  • Description updated (diff)
  • Status changed from In Progress to Feedback

Publishing images on OBS is raising a lot of questions on top of a usable image that have nothing to do with containerizing the worker in general, and we actually have #43718 so I'm removing it from the AC here.

Actions #37

Updated by okurz almost 4 years ago

Please keep in mind, even if this is not explicitly mentioned: We probably all agree that without automatic tests we would not call any contribution properly long-term supportable. These tests could be very simple, e.g. something like podman run --rm -it .... openqa-worker --help. But if you do not plan to add tests as part of this ticket, which is of course ok, then please create a follow-up so that we don't forget that.
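Such a check could be sketched roughly as follows. smoke_test is a hypothetical helper, and the image name has to be supplied by the caller because it is not given here. The -it flags are left out on the assumption that the test runs without a terminal, and the sketch assumes that openqa-worker --help exits successfully inside the image.

```shell
# Sketch: smoke test that the worker binary in the image starts and prints its help.
# Usage: smoke_test RUNTIME IMAGE   (both arguments are placeholders)
smoke_test() {
    runtime="$1"
    image="$2"
    if "$runtime" run --rm "$image" openqa-worker --help >/dev/null 2>&1; then
        echo "ok: $image"
    else
        echo "FAIL: $image" >&2
        return 1
    fi
}
```

A CI job could then call, for example, smoke_test podman "$IMAGE" and fail on a non-zero return code.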

Actions #39

Updated by livdywan almost 4 years ago

okurz wrote:

Please keep in mind, even if this is not explicitly mentioned: We probably all agree that without automatic tests we would not call any contribution properly long-term supportable. These tests could be very simple, e.g. something like podman run --rm -it .... openqa-worker --help. But if you do not plan to add tests as part of this ticket, which is of course ok, then please create a follow-up so that we don't forget that.

Ack. The specific AC however had build and publish in OBS in it (which #43718 basically is), and we don't even know that we can run tests in OBS after building the image. Although adding two lines in our existing setup might be more straightforward.

Actions #40

Updated by ilausuch almost 4 years ago

  • Status changed from Feedback to Resolved

The PR is merged and there is a build in OBS (https://build.opensuse.org/package/show/home:ilausuch:branches:devel:openQA/openQA_container_image_worker_x86) that builds the container image

Actions #41

Updated by okurz almost 4 years ago

Unfortunately, it seems you overlooked #43715#note-37. I created #43706 for that now.

Actions #42

Updated by okurz over 3 years ago

  • Due date deleted (2020-11-13)