action #130636

open

coordination #110833: [saga][epic] Scale up: openQA can handle a schedule of 100k jobs with 1k worker instances

coordination #108209: [epic] Reduce load on OSD

high response times on osd - Try nginx on OSD size:S

Added by livdywan 11 months ago. Updated 4 days ago.

Status: Workable
Priority: Normal
Assignee: -
Category: Feature requests
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Motivation

Apache in prefork mode uses a lot of resources to provide mediocre performance.

Acceptance criteria

  • AC1: Nginx has been deployed successfully on OSD
  • AC2: No alerts regarding "oh no, apache is down" ;)

Suggestions

  • Make sure there is an easy way to switch back to Apache in case something goes wrong
  • See #129490 for results from O3
  • Adapt OSD nginx config for HTTP + HTTPS (O3 only requires HTTP)
  • We can prepare the deployment of nginx in parallel to Apache, have it deployed, and decide at any time when to switch by simply disabling/enabling the services accordingly. The deployment needs to consider dehydrated+nginx as well. We can switch OSD to nginx to gather real-time data before we suggest using nginx as the default in our openQA documentation and CI infrastructure.
  • Add changes to salt-states-openqa excluding monitoring
  • Ensure that we have no alerts regarding "oh no, apache is down" ;)
  • If there are any bigger issues observed then just revert and note down in follow-up tickets what needs to be solved first (to limit the ticket to size:S)
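
The "switch by disabling/enabling services" idea from the suggestions above could be sketched roughly as follows. This is only an illustration, not the actual salt-states-openqa implementation: the unit names `apache2.service` and `nginx.service` and the helper names `switch_commands` and `switch` are assumptions, and on a Salt-managed host the service state would also have to be changed in the Salt states so the next highstate does not revert it.

```python
import subprocess
from typing import List

# Assumed systemd unit names (hypothetical for OSD); adjust to the real units.
SERVICES = {"apache": "apache2.service", "nginx": "nginx.service"}


def switch_commands(target: str) -> List[List[str]]:
    """Return the systemctl commands that make `target` the active web server."""
    if target not in SERVICES:
        raise ValueError(f"unknown web server: {target!r}")
    other = next(name for name in SERVICES if name != target)
    return [
        # Stop and disable the other server first so ports 80/443 are free.
        ["systemctl", "disable", "--now", SERVICES[other]],
        # Then enable and start the requested server.
        ["systemctl", "enable", "--now", SERVICES[target]],
    ]


def switch(target: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them, aborting on first failure."""
    for cmd in switch_commands(target):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `switch("nginx")` only prints the two commands; `switch("apache", dry_run=False)` would be the corresponding rollback path. Keeping both directions in one helper is meant to reflect the requirement that switching back to Apache stays as easy as switching away from it.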

Out of scope

  • Determining whether Nginx rate limiting features work for our use cases (see #159651)
  • Full monitoring integration

Related issues 4 (1 open, 3 closed)

  • Related to openQA Infrastructure - action #157081: OSD unresponsive or significantly slow for some minutes 2024-03-12 08:30Z (Resolved, okurz, 2024-03-12)
  • Related to openQA Infrastructure - action #158059: OSD unresponsive or significantly slow for some minutes 2024-03-26 13:34Z (Resolved, okurz)
  • Copied from openQA Project - action #129490: high response times on osd - Try nginx on o3 with enabled load limiting or load balancing features (Resolved, kraih)
  • Copied to openQA Project - action #159651: high response times on osd - nginx with enabled rate limiting features size:S (Workable, 2024-04-26)

Actions #1

Updated by livdywan 11 months ago

  • Copied from action #129490: high response times on osd - Try nginx on o3 with enabled load limiting or load balancing features added
Actions #2

Updated by okurz 11 months ago

  • Description updated (diff)
Actions #3

Updated by kraih 11 months ago

  • Description updated (diff)
Actions #4

Updated by kraih 11 months ago

During the openQA weekly we talked about this ticket and consider it a good candidate for a mob session. The main problems to solve are the Salt deployment and the SSL configuration, as well as a simple way to roll back the deployment and use Apache again in case something goes wrong.

Actions #5

Updated by kraih 11 months ago

  • Description updated (diff)
Actions #6

Updated by okurz 11 months ago

We can prepare the deployment of nginx in parallel to Apache, have it deployed, and decide at any time when to switch by simply disabling/enabling the services accordingly. The deployment needs to consider dehydrated+nginx as well. We can switch OSD to nginx to gather real-time data before we suggest using nginx as the default in our openQA documentation and CI infrastructure.

Actions #7

Updated by okurz about 2 months ago

  • Related to action #157081: OSD unresponsive or significantly slow for some minutes 2024-03-12 08:30Z added
Actions #8

Updated by okurz about 1 month ago

  • Related to action #158059: OSD unresponsive or significantly slow for some minutes 2024-03-26 13:34Z added
Actions #9

Updated by okurz 4 days ago

  • Tags set to infra
  • Target version changed from future to Ready

Due to repeated issues with unresponsiveness, we should give this more focus and bring it onto the backlog now.

Actions #10

Updated by okurz 4 days ago

  • Copied to action #159651: high response times on osd - nginx with enabled rate limiting features size:S added
Actions #11

Updated by jbaier_cz 4 days ago · Edited

  • Subject changed from high response times on osd - Try nginx on osd with enabled load limiting or load balancing features to high response times on osd - Try nginx on OSD size:S
  • Status changed from New to Workable
Actions #12

Updated by jbaier_cz 4 days ago

  • Description updated (diff)