action #131024

closed

coordination #110833: [saga][epic] Scale up: openQA can handle a schedule of 100k jobs with 1k worker instances

coordination #108209: [epic] Reduce load on OSD

Ensure both nginx+apache are properly covered in packages+testing+documentation size:S

Added by okurz over 1 year ago. Updated over 1 year ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Feature requests
Target version:
Start date:
Due date:
% Done:

0%

Estimated time:

Description

Motivation

By default we use apache; for o3 we use nginx. Before we recommend nginx more widely, we should ensure that nginx is properly tested as part of our various tests. Maybe nginx is already tested as part of the container setup, including our config? See https://github.com/os-autoinst/openQA/blob/master/container/webui/docker-compose.yaml#L144 and the use of the related config file(s).

Acceptance criteria

  • AC1: Our apache+nginx configs within github.com/os-autoinst/openQA/ are tested as part of automated tests
  • AC2: Both apache+nginx configs are deployed from openSUSE packages
  • AC3: Both apache+nginx are covered in openQA installation documentation

Suggestions

  • Crosscheck the docker-compose based CI test that uses nginx
  • Take a look into Makefile and dist/rpm/openQA.spec for mentions of "apache"
  • Extend installation and packaging instructions to cover both apache and nginx. Likely we can just install both the apache and nginx configs unconditionally, as they don't conflict. Do not create separate "openQA-apache" and "openQA-nginx" packages: "openQA" already contains /etc/apache2/, so just extend that with nginx files?
  • In dist/rpm/openQA.spec the "single-instance" package requires apache. We do not necessarily need to supply both nginx and apache for that so we should likely be ok to just keep apache in there as is
  • We want to support both apache+nginx but it's ok if they are only supported/tested/used for specific use cases
  • Provide nginx config using an openQA subpackage (that one could recommend nginx then)

Rollback steps

  • Remove zypper lock on o3

Related

  • #130477 for generalizing systemd files regarding the webserver

Related issues 4 (0 open, 4 closed)

  • Related to openQA Infrastructure (public) - action #132200: openQA is not accessible (Resolved, tinita, 2023-07-02)
  • Related to openQA Infrastructure (public) - action #132218: Conduct lessons learned for "openQA is not accessible" on 2023-07-02 (Resolved, okurz, 2023-07-02)
  • Blocks openQA Project (public) - action #129487: high response times on osd - Limit the number of concurrent job upload handling on webUI side. Can we use a semaphore or lock using the database? size:M (Rejected, okurz)
  • Copied from openQA Project (public) - action #130477: [O3]http connection to O3 repo is broken sporadically in virtualization tests, likely due to systemd dependencies on apache/nginx size:M (Resolved, mkittler, 2023-06-07)

Actions #1

Updated by okurz over 1 year ago

  • Copied from action #130477: [O3]http connection to O3 repo is broken sporadically in virtualization tests, likely due to systemd dependencies on apache/nginx size:M added
Actions #2

Updated by okurz over 1 year ago

  • Description updated (diff)
  • Status changed from New to Blocked
  • Assignee set to okurz

waiting for #130477 first

Actions #3

Updated by okurz over 1 year ago

  • Status changed from Blocked to New
  • Assignee deleted (okurz)

As stated by mkittler they see that we should go ahead with this ticket regardless of the state in #130477 so setting back to "New"

Actions #4

Updated by dheidler over 1 year ago

  • Subject changed from Ensure both nginx+apache are properly covered in packages+testing+documentation to Ensure both nginx+apache are properly covered in packages+testing+documentation size:S
  • Description updated (diff)
  • Status changed from New to Workable
Actions #5

Updated by dheidler over 1 year ago

  • Assignee set to dheidler
Actions #6

Updated by dheidler over 1 year ago

  • Status changed from Workable to In Progress
Actions #9

Updated by dheidler over 1 year ago

Nginx is covered by the docker-compose test.
The config is a bit different from the nginx config used in the rpm package, though.

Actions #10

Updated by dheidler over 1 year ago

  • Status changed from In Progress to Feedback
Actions #11

Updated by okurz over 1 year ago

  • Description updated (diff)

https://github.com/os-autoinst/openQA/pull/5231 broke openqa.opensuse.org with

Jul 02 03:30:43 ariel systemd[1]: Starting The nginx HTTP and reverse proxy server...
Jul 02 03:30:43 ariel nginx[1806]: nginx: [emerg] duplicate upstream "webui" in /etc/nginx/vhosts.d/openqa.conf:3
Jul 02 03:30:43 ariel nginx[1806]: nginx: configuration file /etc/nginx/nginx.conf test failed

Removing the file conf.d/openqa.conf instead yields 403 Forbidden. For now I reverted to the old file in conf.d and removed the vhosts file. To prevent an automatic upgrade from breaking this again I added a zypper lock: zypper al -m "https://progress.opensuse.org/issues/131024" openQA*
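
The "duplicate upstream" failure in the log above can in principle be caught before nginx is restarted. The following is a minimal sketch in Python, assuming the config files have already been read into a dictionary. The function name find_duplicate_upstreams, the sample file contents and the port are illustrative assumptions and are not taken from the openQA package.

```python
import re
from collections import defaultdict

# Matches "upstream <name> {" at the start of a line (leading whitespace allowed).
# Lines commented out with "#" do not match because "upstream" must come first.
UPSTREAM_RE = re.compile(r"^\s*upstream\s+(\S+)\s*\{", re.MULTILINE)


def find_duplicate_upstreams(configs):
    """Return {upstream name: sorted file paths} for names defined more than once.

    `configs` maps a file path to that file's text content.
    """
    seen = defaultdict(list)
    for path, text in configs.items():
        for name in UPSTREAM_RE.findall(text):
            seen[name].append(path)
    return {name: sorted(paths) for name, paths in seen.items() if len(paths) > 1}


# Hypothetical file contents mimicking the situation on o3: the old file in
# conf.d and the newly packaged file in vhosts.d both define "webui".
example = {
    "/etc/nginx/conf.d/openqa.conf": "upstream webui {\n    server 127.0.0.1:9526;\n}\n",
    "/etc/nginx/vhosts.d/openqa.conf": "upstream webui {\n    server 127.0.0.1:9526;\n}\n",
}
duplicates = find_duplicate_upstreams(example)
print(duplicates)
```

On a real host nginx -t remains the authoritative check. The sketch only illustrates why the same upstream block present in both conf.d and vhosts.d makes the configuration test fail.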

Actions #13

Updated by tinita over 1 year ago

Also https://openqa.opensuse.org/snapshot-changes/opensuse/Tumbleweed/ was not working.
I needed to start the service explicitly: systemctl start factory-package-news-web.service

Actions #14

Updated by okurz over 1 year ago

  • Related to action #132218: Conduct lessons learned for "openQA is not accessible" on 2023-07-02 added
Actions #15

Updated by okurz over 1 year ago

As brainstormed in #132218 please adhere to the following suggestions:

  1. Come up with a way to structure the config so that there is a file from the package which no admin needs to change manually, and keep the instance-specific configuration in another layer. Research RPM config handling and test locally, e.g. with a local container: build the package locally with osc build and install it manually within the test environment with manual rpm calls -> to be covered in #131024
  2. Explicitly test on o3 before merge, or closely monitor right afterwards -> to be covered in #131024
  3. Find out if we should reload nginx (and apache) on config file updates from package installations.
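
If a reload on config updates turns out to be wanted (item 3), one option is to reload only after the configuration test passes, so that a broken packaged config does not take down a running server. The following is a rough sketch in Python under that assumption, not what the package actually does. The function name reload_if_valid and the injectable run parameter are hypothetical; nginx -t and systemctl reload nginx are the standard commands.

```python
import subprocess


def reload_if_valid(run=subprocess.run):
    """Reload nginx only when `nginx -t` accepts the current configuration.

    `run` is injectable so the logic can be exercised without touching a
    real nginx; by default it is subprocess.run.
    Returns True if a reload was triggered, False if the config test failed.
    """
    check = run(["nginx", "-t"], capture_output=True, text=True)
    if check.returncode != 0:
        # Leave the running server untouched; the old config stays active.
        return False
    run(["systemctl", "reload", "nginx"], check=True)
    return True
```

Calling reload_if_valid() with no arguments on a host requires root privileges and an installed nginx. The same pattern would apply to apache with its own config test command.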
Actions #16

Updated by livdywan over 1 year ago

  • Status changed from Feedback to In Progress

Dominik:

  • PR is back to draft
  • preparing the setup and bootstrap scripts to allow testing the nginx config as well
Actions #20

Updated by openqa_review over 1 year ago

  • Due date set to 2023-07-20

Setting due date based on mean cycle time of SUSE QE Tools

Actions #25

Updated by livdywan over 1 year ago

  • Blocks action #129487: high response times on osd - Limit the number of concurrent job upload handling on webUI side. Can we use a semaphore or lock using the database? size:M added
Actions #26

Updated by dheidler over 1 year ago

on o3:

  • package lock removed
  • updated openQA rpm
  • switched to new nginx config
Actions #27

Updated by dheidler over 1 year ago

  • Status changed from In Progress to Blocked

Waiting for changes to reach Factory before we can schedule the bootstrap test: https://github.com/os-autoinst/opensuse-jobgroups/pull/347

Actions #28

Updated by okurz over 1 year ago

Please use "Blocked" only when you have a URL that people can follow to see what we are waiting for. I think you can reference the build throttled Jenkins build directly

Actions #29

Updated by okurz over 1 year ago

  • Status changed from Blocked to Workable
  • Priority changed from Normal to High
Actions #30

Updated by dheidler over 1 year ago

  • Status changed from Workable to Feedback

Pushed some fixes for the bootstrap script.
Waiting for them to hit factory.

Actions #32

Updated by jbaier_cz over 1 year ago

I am afraid that the configure-web-proxy --proxy=nginx does some stuff which makes nginx not happy: see https://openqa.opensuse.org/tests/3422958#step/openqa_webui/19

Actions #37

Updated by dheidler over 1 year ago

Ah - seems that https://openqa.opensuse.org/tests/3439538#step/openqa_bootstrap/14 still fails due to fetchneedles using a group that I would guess is installed by apache.

Actions #38

Updated by dheidler over 1 year ago

fix fetchneedles for installations without apache2:
https://github.com/os-autoinst/openQA/pull/5252

Actions #39

Updated by okurz over 1 year ago

  • Due date deleted (2023-07-20)
  • Status changed from Feedback to Blocked

https://github.com/os-autoinst/openQA/pull/5252 merged. I suggest blocking this on #132143 for verification

Actions #40

Updated by okurz over 1 year ago

  • Status changed from Blocked to In Progress

o3 is generally up, please try now again.

Actions #41

Updated by openqa_review over 1 year ago

  • Due date set to 2023-08-08

Setting due date based on mean cycle time of SUSE QE Tools

Actions #43

Updated by dheidler over 1 year ago

I think you meant to reference #106922

Actions #44

Updated by dheidler over 1 year ago

  • Status changed from In Progress to Feedback

The referenced issue is blocked by the wheels epic, which is itself blocked on subtasks.
That would mean this ticket stays blocked indefinitely.

As the fail is not related to the changes and the changed sections actually pass in the testsuite,
we will merge this for now.

Actions #45

Updated by livdywan over 1 year ago

dheidler wrote:

As the fail is not related to the changes and the changed sections actually pass in the testsuite,
we will merge this for now.

See #133301

Actions #46

Updated by okurz over 1 year ago

so in the end https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/134 was merged and I also resolved #133301 meaning that now https://openqa.opensuse.org/group_overview/24 looks stable enough and includes nginx tests. IMHO you can resolve unless we see problems over the night.

Actions #47

Updated by livdywan over 1 year ago

  • Status changed from Feedback to Resolved

okurz wrote:

so in the end https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/134 was merged and I also resolved #133301 meaning that now https://openqa.opensuse.org/group_overview/24 looks stable enough and includes nginx tests. IMHO you can resolve unless we see problems over the night.

👍🏾

Actions #48

Updated by okurz over 1 year ago

  • Due date deleted (2023-08-08)